Discounting, beyond Utilitarianism

Discounted utilitarianism and the Ramsey equation prevail in the debate on the discount rate on consumption. The utility discount rate is assumed to be constant and to reflect either the uncertainty about the existence of future generations or a pure preference for the present. The authors question the unique status of discounted utilitarianism and discuss the implications of alternative criteria addressing the key issues of equity in risky situations and variable population. To do so, they introduce a class of intertemporal social objectives, named Expected Prioritarian Equally Distributed Equivalent (EPEDE) criteria. The class is more flexible than discounted utilitarianism in terms of population ethics and it disentangles risk aversion and inequality aversion. The authors show that these social objectives imply interesting modifications of the Ramsey formula, and shed new light on Weitzman’s “dismal theorem”.
JEL D63


Introduction
How much should we save and invest for future generations? The important stream of literature that addresses this question adopts the discounted utilitarian social welfare function and proceeds as follows. The intertemporal social objective takes the form
$$\sum_{t \ge 0} \frac{1}{(1+\delta)^t}\, E u(c_t), \qquad (1)$$
where $c_t$ is, to simplify, the consumption of the representative agent of generation $t$. An investment from period 0 to period $t$ that yields a sure rate of return $r^*$ is worth doing, in the margin, if
$$(1+r^*)^t\, \frac{1}{(1+\delta)^t}\, E u'(c_t) \ge u'(c_0),$$
i.e., if
$$r^* \ge (1+\delta) \left( \frac{u'(c_0)}{E u'(c_t)} \right)^{1/t} - 1,$$
or approximately
$$r^* \ge \delta - \frac{1}{t} \ln \frac{E u'(c_t)}{u'(c_0)}, \qquad (2)$$
the approximation being exact in continuous time. Equation (2) is known as the Ramsey formula (Ramsey, 1928). Two questions have been studied around it.
First, a prolific literature has studied the impact of uncertainty about future growth on the social discount rate. For instance, the Stern review (Stern, 2006, Part 1, Appendix of Chapter 2) discusses the implications of increasing uncertainty on the social discount rate. Numerous papers (Gollier, 2002; Gollier and Weitzman, 2010; Gierlinger and Gollier, 2008; Traeger, 2012) have now shown that, depending on the uncertainty on future growth, it may happen that $\frac{1}{t} \ln E u'(c_t)$ is large when $t$ is large. This occurs when the risk of an unfavorable growth path is substantial. In this case the threshold rate of return may be very low. This line of argument justifies using a lower discount rate for long term projects than for short term projects.
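To make the horizon effect concrete, here is a minimal Python sketch of the threshold rate of return for a marginal sure investment under discounted utilitarianism. It assumes the indifference condition $(1+r^*)^t E u'(c_t)/(1+\delta)^t = u'(c_0)$, a CRRA marginal utility $u'(c) = c^{-\eta}$, and a hypothetical two-scenario growth risk. The function names, the scenarios and all numbers are illustrative assumptions, not values taken from the literature cited above.

```python
import math

def marginal_utility(c, eta):
    """CRRA marginal utility u'(c) = c**(-eta) (assumed functional form)."""
    return c ** (-eta)

def expected_marginal_utility(c0, eta, scenarios, t):
    """E u'(c_t) when c_t = c0 * (1 + g)**t with probability p, for (p, g) in scenarios."""
    return sum(p * marginal_utility(c0 * (1 + g) ** t, eta) for p, g in scenarios)

def threshold_exact(delta, eta, c0, scenarios, t):
    """Exact threshold: (1 + delta) * (u'(c0) / E u'(c_t))**(1/t) - 1."""
    ratio = marginal_utility(c0, eta) / expected_marginal_utility(c0, eta, scenarios, t)
    return (1 + delta) * ratio ** (1 / t) - 1

def threshold_log(delta, eta, c0, scenarios, t):
    """Log approximation: delta - (1/t) * ln(E u'(c_t) / u'(c0))."""
    ratio = expected_marginal_utility(c0, eta, scenarios, t) / marginal_utility(c0, eta)
    return delta - math.log(ratio) / t

# Hypothetical scenarios: sure 2% growth versus a 10% chance of permanent stagnation.
SURE = [(1.0, 0.02)]
RISKY = [(0.9, 0.02), (0.1, 0.0)]

for t in (10, 100, 300):
    print(t,
          round(threshold_exact(0.001, 2.0, 1.0, SURE, t), 4),
          round(threshold_exact(0.001, 2.0, 1.0, RISKY, t), 4),
          round(threshold_log(0.001, 2.0, 1.0, RISKY, t), 4))
```

Under sure growth the threshold does not depend on the horizon, whereas with a persistent risk of stagnation the expected marginal utility is eventually dominated by the bad scenario and the threshold falls towards $\delta$ as $t$ grows.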
The second question is to choose an appropriate value for the pure time discount rate $\delta$ and an appropriate calibration of the utility function $u$. Two conflicting views have emerged. [1] The so-called "ethical" or prescriptive approach advocates picking $\delta$ in such a way as to respect impartiality across generations.
Footnote 1. They are called the prescriptive and the descriptive approach by Arrow et al. (1996).
As the only reason to introduce asymmetry across generations is the possibility of extinction, this approach advocates a very low value for $\delta$. Indeed, if the probability of extinction follows a Poisson process, i.e., has the same magnitude $p$ for every $t$, then the probability that extinction has not occurred before $t$ is $(1-p)^t$, which is equal to $\left(\frac{1}{1+\delta}\right)^t$ when $\delta \simeq p$. Following this reasoning, the Stern review takes $\delta = 0.001$ per annum. The ethical approach also advocates calibrating $u$ on the basis of inequality aversion within generations. For instance, if a function $u(c) = \frac{1}{1-\eta} c^{1-\eta}$ is adopted, in the absence of uncertainty about $c_t$, the threshold return for the ethical approach is then equal to
$$p + \eta\, \frac{1}{t} \ln \frac{c_t}{c_0}. \qquad (3)$$
Even if $p$ is low, one may end up with a high threshold if inequality aversion is high and the average growth rate between 0 and $t$ is substantial. Specifically, $\eta$ should be at least 2 if one considers it acceptable to perform a transfer from a rich person to a poor person that represents the same fraction of their initial consumption for each of them. With $\eta \ge 2$ and an annual growth rate of two percent, the threshold is greater than 4%. [2]
Footnote 2. The Stern review adopts a low parameter $\eta$ (equal to one) without much justification. This is not inconsistent with its stark insistence on impartiality across generations (because impartiality and inequality aversion are not the same), but it is nonetheless somewhat surprising.
The alternative ("descriptive" or "positive") approach advocates looking at revealed preferences (see, e.g., Nordhaus, 2008). It then seeks, on the basis of observed individual choices in markets, to estimate pure time preference for the calibration of $\delta$ and risk aversion and/or intertemporal elasticity of substitution for the calibration of $\eta$. A weakness of the revealed preference approach in this case is that it is hard to believe that individual preferences over risk and time within a lifetime should have much to say about the adjudication of conflicting interests across generations.
Another argument in favor of using the market rate is based on arbitrage.
Investing at a lower rate than the market rate yields returns that are dominated by the market returns. For instance, a climate policy with low returns appears dominated by a business-as-usual or a gradual policy that invests in economic activities with greater returns. With the latter policy, later generations will be richer and be able to allocate their greater resources between climate preservation and other uses. As a matter of fact, the arbitrage argument can be considered an objection against the ethical approach only if one misunderstands the latter. The ethical approach provides criteria that evaluate whether an investment improves social welfare. Typically such criteria satisfy the dominance principle that underlies the arbitrage argument. If all generations are made better off by a gradual policy than by a more radical climate policy, a typical social welfare criterion will prefer the gradual policy. An investment with greater returns has a greater net present value whatever the value of the discount rate that is used to compute the present value.
In view of these considerations, we consider that the ethical approach, based on the idea of relying on a well-founded social welfare criterion, is worth being pursued. But the classical framework in which the discounting problem is addressed in the literature, as summarized above, suffers from serious limitations.
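The calibration of the ethical approach discussed above can be checked numerically. The sketch below is only illustrative: it assumes a constant per-period extinction probability $p$, the CRRA utility mentioned in the text, continuous growth at a constant rate, and a threshold of the form $p + \eta \frac{1}{t} \ln(c_t/c_0)$. The function names and the 50-period horizon are hypothetical choices.

```python
import math

def delta_from_extinction(p):
    """Utility discount rate delta such that (1/(1+delta))**t == (1-p)**t."""
    return p / (1 - p)

def ethical_threshold(p, eta, c0, ct, t):
    """Threshold return p + eta * (1/t) * ln(ct/c0), with no uncertainty about ct (assumed form)."""
    return p + eta * math.log(ct / c0) / t

p = 0.001                          # per-period extinction probability
delta = delta_from_extinction(p)   # close to p when p is small
t = 50                             # hypothetical horizon
c_t = math.exp(0.02 * t)           # consumption after 2% continuous growth, with c0 = 1
r_eta2 = ethical_threshold(p, 2.0, 1.0, c_t, t)
r_eta1 = ethical_threshold(p, 1.0, 1.0, c_t, t)
print(delta, r_eta1, r_eta2)
```

With $\eta = 2$ the growth term alone contributes four percentage points, so the threshold exceeds 4% even though $p$ is tiny, while $\eta = 1$ gives roughly half of that.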
In this paper, we study three important amendments to this framework.
First, the utilitarian criterion (1) is questionable in applications involving risk, although it has been advocated in a seminal paper by Harsanyi (1955).
In the presence of risk, the utility function $u$ must have a specific concavity if it is meant to reflect the population's attitude about risk-taking, and this may be too restrictive for inequality aversion. In order to deal with this dilemma between excessive paternalism and insufficient inequality aversion, alternative, non-utilitarian, criteria for the evaluation of collective risks have recently been proposed (Bommier and Zuber, 2008; Fleurbaey, 2010; Fleurbaey and Zuber, 2012), and in this paper we examine how to extend these alternative approaches to the context of long-run evaluation.
Second, most of the literature on long term evaluation assumes an infinite horizon. Sometimes people consider a small probability of extinction, but this is thought of as a minor amendment to the infinite horizon model. The literature on intergenerational equity has uncovered many troubling results that are entirely due to the presence of an infinite horizon in the model (see, e.g., Basu and Mitra, 2003; Lauwers, 2010; Zame, 2007). We argue that such complications are superfluous because a more realistic assumption is that the life span of humankind [3] is finite and uncertain. Moreover, a good framework should be flexible about how the risk affecting growth and the uncertainty about the survival of the human species interact. This second point confirms the importance of finding social criteria which deal with risk in a satisfactory way.
Footnote 3. The other species should ideally be considered as well. They are ignored in this paper.
Third, most of the literature on discounting assumes that every generation can be represented by a single agent. A better framework should allow for intragenerational inequalities and the welfare analysis should be individualistic, i.e., compute social welfare on the basis of individual well-being, not directly at the level of generations. This is not a contentious point, and many authors grant it (Schelling, 1995; Stern, 2007; and many others), but it seems important to make intragenerational inequalities part of the main model, instead of a minor extension, in order to frame the problem in a sound way.
The difficulty of our analysis is that it combines the hard ethical issues having to do with social choice under uncertainty and with a variable population, as an uncertain life span induces an uncertain size of the human species. Classical references in these fields are Broome (1991), Broome (2004), and Blackorby, Bossert and Donaldson (2005). We particularly build on previous work by Bommier and Zuber (2008) and Blackorby, Bossert and Donaldson (2007) about variable population and uncertainty, and by Fleurbaey (2010) and Fleurbaey and Zuber (2012) about uncertainty. The paper by Blackorby, Bossert and Donaldson (2007) is formally the closest to our work and inspires our approach to the variable population problem, but they introduce strong Pareto and separability principles which impose the additive structure of utilitarianism, whereas we consider more general possibilities. Bommier and Zuber (2008) address a similar question and adopt a methodology similar to ours, but they exclusively focus on the risk of extinction, assuming away any intragenerational inequality and any uncertainty about growth. They rely on a weaker Pareto principle than Blackorby, Bossert and Donaldson (2007) but even this version may not be satisfactory according to arguments in Fleurbaey (2010), as we will explain in the next section. Our proposed social objectives and the ethical principles dealing with uncertainty are inspired by Fleurbaey (2010) and Fleurbaey and Zuber (2012). [4] However, these papers do not address the problem of a variable population.
It turns out that covering variable population sizes requires rethinking some of the axioms used in discussions about equity and risk. In particular, axioms of independence with respect to individuals bearing no risk (or past generations), like those suggested by Fleurbaey and Zuber (2012), have to be adapted: even if the size of the population bearing no risk is fixed, the size of the population bearing risks may vary, due to the risk on existence. This significantly modifies the analysis. The variable population framework also leads us to introduce an axiom of consistency between different population sizes. Another difference with the previous literature is that the proofs of our results involve techniques different from those in Bommier and Zuber (2008) and Fleurbaey and Zuber (2012), who rely on a method proposed by Keeney and Raiffa (1976).
The paper is organized as follows. Section 2 introduces three families of social objectives which generalize the utilitarian criterion and a justification is provided for each of them. Section 3 derives the implications of each of these social objectives for the discount rate and shows that the Ramsey formulae (2) and (3) need to be supplemented with additional terms involving the relative well-being of future individuals and the correlation between their well-being (or their existence) and social welfare. Section 4 derives the implications of these new criteria for the question of catastrophic risks which has been studied in the utilitarian context by Weitzman (2009). We show that the dismal theorem can be made more pressing with some non-utilitarian criteria, but at the cost of a social aversion to catastrophes that goes against individual preferences. Section 5 concludes. An Appendix contains the proofs of the results of Section 2.
Footnote 4. [...] of consequentialism. In contrast, we are content with the expected utility model as a decision criterion under risk.
2 Social objectives

The framework
We let $\mathbb{N}$ denote the set of positive integers, $\mathbb{N}_3$ the set of integers starting from 3, $\mathcal{N}_3$ the set of subsets of $\mathbb{N}$ with cardinality at least 3, $\mathbb{R}$ the set of real numbers, and $\mathbb{R}_+$ the set of non-negative real numbers. For a set $S$ and any $n \in \mathbb{N}$, $S^n$ is the $n$-fold Cartesian product of $S$.
The set of potential individuals (who may or may not exist) is $\mathbb{N}$. In the definition of a person, we include all his or her relevant characteristics (gender, birthplace, and so forth) and in particular the generation he or she belongs to. Hence there exists a mapping $T : \mathbb{N} \to \mathbb{N}$ that associates to each individual $i$ the date $T(i)$ at which he or she will be born, provided he or she comes to life.
In contrast with Blackorby, Bossert and Donaldson (2007), we work directly with utility numbers. Hence an alternative u is a collection of utility numbers, one for each individual alive in the alternative. Let X be a closed interval in R.
We let $U = \bigcup_{n \in \mathbb{N}_3} X^n$ denote the set of possible alternatives: an alternative is a vector of utility numbers, one for each individual living in that particular alternative. Note that we restrict attention to situations in which the population has at least three members. In a variable-population framework, the size of the population may vary from one alternative to another. It is important to keep track of the population in an alternative. For any $u \in U$, we let $N(u)$ be the set of individuals in the alternative and $n(u) = |N(u)|$ be the number of individuals in the alternative.
Uncertainty is described by $m \in \mathbb{N} \setminus \{1\}$ states of the world. A prospect is a vector belonging to the set $\mathbf{U} = U^m$ with typical element $u = (u_1, \dots, u_m)$. A lottery is the combination of a probability vector $p = (p_1, \dots, p_m) \in \Sigma^{m-1}$, where $\Sigma^{m-1}$ denotes the closed $(m-1)$-simplex, and a prospect $u$. The set of lotteries is denoted $L = \Sigma^{m-1} \times \mathbf{U}$. We choose to work with lotteries rather than prospects even though, in principle, all lotteries can be reformulated as prospects for a suitable partition of states of the world. The reason is that it is convenient in applications to be able to describe possible scenarios not only in terms of varying consequences, but also in terms of different probabilities over a small set of identified states of the world. In particular, the role of probabilities in the determination of the discount rate will be much more transparent in this way.
For a prospect $u$, whenever $i \in N(u_s)$ for $s \in \{1, \dots, m\}$, $u^i_s$ denotes the utility of individual $i$ in state of the world $s$. For a subpopulation $N \subset \mathbb{N}$, we denote by $\mathbf{U}_N$ the set of prospects such that, for any $u \in \mathbf{U}_N$ and any $s \in \{1, \dots, m\}$, $N(u_s) = N$. These are the prospects such that the same individuals are present in all states of the world. In this case, $u^i = (u^i_1, \dots, u^i_m)$ represents the prospect of individual $i$. We let $\mathbf{1}_m$ be the unit vector $(1, \dots, 1)$ of $\mathbb{R}^m$.
Our problem is to define a social ordering, i.e., a transitive and complete binary relation R on L. The expression (p, u)R(q, v) will mean that (p, u) is at least as good as (q, v). We let P and I denote the corresponding strict preference and indifference relations.
More precisely, our problem is to select reasonable orderings among the myriad possible orderings of this set. The standard way of making such a selection is to list basic requirements (axioms) that embody appealing ethical principles, and to seek the orderings that satisfy such requirements.

Principles
We first want R to be as rational as one could be, given that it serves for a reasoned evaluation of social situations. The expected utility criterion, in spite of many criticisms, remains the benchmark of rational decision-making under risk and the following axiom requires R to take the form of expected (social) utility.
Axiom 1 (Social expected utility hypothesis) There exists a continuous function $V : U \to \mathbb{R}$ such that, for all $(p, u), (q, v) \in L$:
$$(p, u)\, R\, (q, v) \iff \sum_{s=1}^{m} p_s V(u_s) \ge \sum_{s=1}^{m} q_s V(v_s).$$
One limitation implied by this axiom is that it prevents $R$ from evaluating what happens in one state of the world taking into account what would have happened in other states. In this fashion, ex ante fairness in lotteries (Diamond, 1967) is ignored, unless the utility numbers $u_s$ in any given state do incorporate a measure of the chances that individuals had in other states. It is formally easy to generalize the criterion and rewrite it as $\sum_{s=1}^{m} p_s V_s(p, u)$, but it is then difficult to come up with a precise proposal for the state-specific functions $V_s$ that would evaluate the consequences in state $s$ as a function of the whole lottery $(p, u)$ (see Fleurbaey, Gajdos and Zuber, 2010).
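As a minimal illustration of how an ordering of the expected social utility type ranks lotteries, the Python sketch below represents a lottery as a probability vector paired with a prospect (one utility vector per state, possibly of different sizes). The function names are hypothetical, and the choice of $V$ as a plain sum of utilities is only a placeholder assumption: any continuous $V$ could be passed instead.

```python
def expected_social_utility(probs, prospect, V):
    """Sum over states s of p_s * V(u_s)."""
    return sum(p * V(u_s) for p, u_s in zip(probs, prospect))

def weakly_preferred(lottery_a, lottery_b, V):
    """True when lottery_a is at least as good as lottery_b under expected social utility."""
    (p, u), (q, v) = lottery_a, lottery_b
    return expected_social_utility(p, u, V) >= expected_social_utility(q, v, V)

# Two states; in lottery A a fourth individual exists only in the second state.
lottery_a = ((0.5, 0.5), [(1.0, 2.0, 3.0), (1.0, 2.0, 3.0, 4.0)])
lottery_b = ((0.5, 0.5), [(2.0, 2.0, 2.0), (2.0, 2.0, 2.0)])

V = sum  # placeholder ex post evaluation (assumption, not the paper's proposal)
print(expected_social_utility(lottery_a[0], lottery_a[1], V))
print(weakly_preferred(lottery_a, lottery_b, V))
```

Because each state is evaluated by the same function $V$ of that state's utility vector alone, nothing in this computation can reward fairness in the ex ante distribution of chances, which is the limitation noted above.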
The next axiom is a standard anonymity requirement.
The Pareto principle is the hallmark of social evaluation, but the principle of consumer sovereignty is normally invoked when the individuals are fully informed about the options. In the presence of risk, by definition the individuals do not know what will ultimately happen if they choose such or such option, so that respecting their ex ante preferences is less compelling than under full information.
In particular, there are situations in which the distribution of final situations across individuals is known ex ante, while it is only the identity of winners and losers that is not known. In such situations, the ignorant individuals may all be willing to take a risk, but everyone knows that it is not in the interest of the ultimate losers and everyone knows that this ex ante unanimous preference for a risky lottery will break down as soon as uncertainty is resolved. In view of such considerations, we restrict the application of the Pareto principle to situations in which such a breakdown of unanimity with greater information cannot occur.
Two cases are retained here. There is first the case of risk-free prospects, in which full information about final utilities prevails.
Axiom 3 (Pareto for no risk) For all $N \in \mathcal{N}_3$, for all $p, q \in \Sigma^{m-1}$ and $u, v \in \mathbf{U}_N$ such that for all $i \in N$, there is $u(i), v(i) \in X$ such that $u^i = u(i)\mathbf{1}_m$ and [...]
Second, there is the case in which all individuals share exactly the same fate in all states of the world. They may ultimately regret [5] having taken a risk if they are unlucky, but they will unanimously do so.
Axiom 4 (Pareto for equal risk) For all $N \in \mathcal{N}_3$, for all $p, q \in \Sigma^{m-1}$, for all $u, v \in \mathbf{U}_N$ such that for all $s \in \{1, \dots, m\}$ there is $u(s), v(s) \in X$ such that [...]
We also introduce some requirements of subpopulation separability. The motivation for such axioms is primarily a matter of simplicity. Under separability it is possible to perform the evaluation of a certain change affecting a particular population (e.g., the present and future generations) independently of the rest of the population that is not concerned (e.g., the past generations). The first separability axiom applies to riskless prospects in which utility is the same in all states.
Axiom 5 (Separability for sure prospects) For all $N \in \mathcal{N}_3$, for all $p, q \in \Sigma^{m-1}$, for all $u, v, \tilde{u}, \tilde{v} \in \mathbf{U}_N$, if there exists $M$ such that $M \subset N$ and [...]
For prospects that involve risk, separability is a more delicate notion because the presence of other agents who may be unconcerned but display a certain correlation with the concerned population may appear relevant to the evaluation.
In particular, it makes sense to prefer prospects with positive correlation across agents (solidarity of fate) to prospects with negative correlation. Therefore we restrict the application of separability to situations in which the unconcerned population bears no risk at all. This is typically the situation of past generations, for instance. Violating this axiom would then require taking account of the level of utility of ancient populations in order to assess future policies, a quite cumbersome obligation.
Axiom 6 (Independence of the utility of the sure) For all $p, q \in \Sigma^{m-1}$, [...]
Note that Axiom 6 is stronger than Axiom 5 not just because it applies to risky prospects, but also because it allows for situations in which the concerned populations are not the same in $u$ and $v$. In this respect, Axiom 6 also differs from the Independence of the Utility of the Sure in Fleurbaey and Zuber (2012), which would correspond in the present framework to the following weaker axiom:
Axiom 7 (Same number independence of the utility of the sure) For all $N \in \mathcal{N}_3$, for all $p, q \in \Sigma^{m-1}$, for all $u, v, \tilde{u}, \tilde{v} \in \mathbf{U}_N$, if there exists $M$ such that, for all $M \subset N$, and [...]
Axiom 6 is a rather natural extension of the idea of separability. If we consider two possible scenarios for the future population, with different individuals being born in each scenario, it still makes sense to disregard the past generations. It may therefore seem appropriate to use Axiom 6 rather than Axiom 7.
Our last axiom deals with the comparison of populations with different sizes.
We want to be as flexible as possible and simply require the comparison to be possible in a certain systematic way. This is the role of the "critical-level" function.
Axiom 8 (Critical-level consistency) There exists a function $C : \mathbb{R} \times \mathbb{N} \to X$ such that for all $p \in \Sigma^{m-1}$, for all $u, v \in \mathbf{U}$, for all $s \in \{1, \dots, m\}$ and for all [...]
Compared to the Critical-Level Consistency Axiom by Blackorby, Bossert and Donaldson (2007), Axiom 8 imposes that the critical level depends on the welfare of the society without the additional individual rather than on the whole vector $u_s$. This is more restrictive but it seems reasonable to argue that if we replace $u_s$ with another vector $\tilde{u}_s$ such that $V(\tilde{u}_s) = V(u_s)$ and $n(\tilde{u}_s) = n(u_s)$, there is no reason to change the critical level.

Three families of social objectives
As a preliminary, it is worth mentioning that standard utilitarianism (with a critical level) is the only criterion that satisfies all the axioms introduced in the previous section. We state this as a separate proposition but it is simply a corollary of Propositions 1 and 2 below.
Proposition 0 The social ordering $R$ satisfies Axioms 1, 2, 3, 4, 6 and 8 iff there exists a scalar $u^c \in X$ such that [...]
The problem with this formula is of course that it does not yield any priority to the worst-off in utility. To avoid this conclusion, some of the axioms must be dropped. We first state a lemma which shows the implications of separability for the ex post evaluation of final distributions of utilities. All the proofs of this section are in the Appendix.
Lemma 1 If the social ordering $R$ satisfies Axioms 1, 2, 3, and 5 then, for all $n \in \mathbb{N}_3$, there exist continuous and increasing functions $\Psi_n$ and $\varphi_n$ such that, for all $u_s$ such that $n(u_s) = n$, [...]
The following proposition identifies a first family of social objectives. It involves the expected value of the equally-distributed equivalent (EDE). Recall that for any given distribution of utilities across individuals, its equally-distributed equivalent is the utility level that, were it equally enjoyed by all individuals, would generate the same level of social welfare as the contemplated distribution.
Proposition 1 The social ordering $R$ satisfies Axioms 1, 2, 3, 4, 5 and 8 iff there exists a continuous increasing function $\varphi$ and a sequence $(\alpha_n, \beta_n) \in \mathbb{R}_{++} \times \mathbb{R}$ such that for all $(p, u), (q, v) \in L$: [...] and the sequence $(\alpha_n, \beta_n)$, $n \ge 3$, satisfies the recursive property: [...] (4)
where $C$ is the critical-level function of Axiom 8.
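The verbal definition of the equally-distributed equivalent can be turned into a short computation. The sketch below assumes, purely for illustration, that ex post social welfare is the average of a transformed utility $\varphi(u^i)$, so that the EDE is $\varphi^{-1}$ of that average. The square-root transform and the sample utilities are hypothetical choices.

```python
import math

def ede(utilities, phi, phi_inv):
    """Equally-distributed equivalent: phi_inv of the mean of phi(u_i) (assumed form)."""
    n = len(utilities)
    return phi_inv(sum(phi(x) for x in utilities) / n)

def square(y):
    """Inverse of the square root on non-negative numbers."""
    return y * y

unequal = (1.0, 4.0, 9.0)
equal = (4.0, 4.0, 4.0)

print(ede(unequal, math.sqrt, square))   # EDE of the unequal distribution
print(sum(unequal) / len(unequal))       # its arithmetic mean, for comparison
print(ede(equal, math.sqrt, square))     # an equal distribution is its own EDE
```

With a concave transform the EDE of an unequal distribution lies below its mean, and the gap between the two is one way to read the social cost of inequality.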
Proposition 1 extends previous results by Fleurbaey (2010) to the variable population framework. Doing so, it provides a more specific form for the EDE function and shows how it relates to the VNM function for different population sizes. It also provides specific results on the form of the critical-level function.
Indeed, the recursive property (4) identified in this Proposition is quite constraining. If for all $n$, $\alpha_n = \alpha$ and $\beta_n = \beta$, Equation (4) implies [...] which means that the critical level is equal to the EDE. When the social ordering exhibits a strong aversion to inequality, the EDE is close to the lowest utility, which implies that it is then considered acceptable to add new members to society only when their utility level is above the minimum.
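The role of the EDE as a critical level can be illustrated numerically. The sketch below assumes, as an illustration only, that the EDE is $\varphi^{-1}$ of the average of $\varphi(u^i)$. It checks that adding an individual whose utility equals the current EDE leaves the EDE unchanged, and that a strongly inequality-averse transform pushes the EDE towards the lowest utility. All function names and numbers are hypothetical.

```python
import math

def ede(utilities, phi, phi_inv):
    """phi_inv of the mean of phi(u_i) (assumed form of the EDE)."""
    return phi_inv(sum(phi(x) for x in utilities) / len(utilities))

def square(y):
    return y * y

# Mild inequality aversion: phi = sqrt.
population = (1.0, 4.0, 9.0)
level = ede(population, math.sqrt, square)
enlarged = population + (level,)            # newcomer exactly at the EDE
print(level, ede(enlarged, math.sqrt, square))

# Strong inequality aversion: phi(u) = -u**(-k) with large k (positive utilities).
K = 10.0

def phi_strong(x):
    return -x ** (-K)

def phi_strong_inv(y):
    return (-y) ** (-1.0 / K)

print(ede(population, phi_strong, phi_strong_inv))  # close to min(population)
```

A newcomer above the EDE raises it and a newcomer below lowers it, which is why, under strong inequality aversion, almost any utility above the minimum makes the addition acceptable.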
The critical level can also be independent of V (u s ), but here again the constraints are substantial, as stated in the following result.
Corollary 1 If the critical-level function $C(x, n)$ associated with the social ordering defined in Proposition 1 is independent of $x$ and if the social ordering is averse to inequality ($\varphi$ strictly concave), then $C$ satisfies the following properties:
1. Either $C(x, n) = \min X$ for all $x \in X$ and for all $n \in \mathbb{N}_3$; or $C(x, n) = \max X$ for all $x \in X$ and for all $n \in \mathbb{N}_3$.
2. If $X$ is not bounded above, then $C(x, n) = \min X$ for all $x \in X$ and for all $n \in \mathbb{N}_3$ and there exists $\varepsilon$ such that $0 < \varepsilon < 1$ and for all $(p, u), (q, v) \in L$: [...] (In this case, $\alpha_n = n$ [...])
The case $C(x, n) = \max X$ is not palatable, as it implies a strong form of anti-populationism. It is noteworthy that the social ordering highlighted in point 2 has limited inequality aversion, as $0 < \varepsilon < 1$.
It is possible to have $C(x, n) = c > \min X$ when $\varphi$ is no longer required to be concave everywhere and the social ordering is inequality averse above the critical level but inequality prone below it, i.e., with a formula like [...]
Although this is a controversial form of social ordering, a possible justification for it is that it focuses on raising individuals above the critical level $c$, even if this means sacrificing those who cannot make it (this is a triage approach discussed in Roemer, 2004).
For the analysis of the discount rate in the next section, what is important is the behavior of $\alpha_n / n$. In this respect, we have the following result:
Corollary 2 If $\varphi$ is concave and differentiable on $X$, $C(x, n)$ is non-decreasing in $x$, and $C(\alpha_n \min X + \beta_n, n) \ge \min X$, then $\alpha_n / n$ is non-increasing in $n$.
This may seem a general result but the assumption that $\varphi$ is differentiable at $\min X$ is not innocuous in terms of inequality aversion. Observe that in case 2 of Corollary 1, in particular, $\alpha_n / n$ is increasing in $n$. When $\alpha_n$ is a constant, $\alpha_n / n$ is decreasing in $n$. Therefore both cases are possible.
An inconvenient feature of the family of social orderings singled out in Proposition 1 is that they do not satisfy Axiom 6 and therefore generally require taking account of seemingly unconcerned individuals (such as the members of past generations) in the evaluation. It is however possible to use the weakened version of Independence of the Utility of the Sure, namely Axiom 7, to obtain the following result:
Corollary 3 The social ordering $R$ satisfies Axioms 1, 2, 3, 4, 7 and 8 iff:
1. Either for all $(p, u), (q, v) \in L$: [...]
Two remarks can be made concerning Corollary 3:
• In the multiplicative case, the representation is valid only if we restrict $X$ to be a subset of $(-\infty, -\frac{\Omega}{\kappa})$ when $\kappa < 0$ or a subset of $(-\frac{\Omega}{\kappa}, +\infty)$ when $\kappa > 0$. When $X = \mathbb{R}$, the social ordering $R$ satisfies Axioms 1, 2, 3, 7 and 8 if and only if the additive representation holds.
• If we require ex-post preferences to exhibit inequality aversion, only the multiplicative case with $\kappa > 0$ is admissible.
If one thinks of sure individuals as past generations, the second representation in Corollary 3 depends on the past only through the number of persons who have lived in the past. If one wants to avoid even this dependence, one must strengthen Axiom 7. For this purpose, we now reintroduce Axiom 6, but drop Axiom 4 momentarily, in order to identify two other interesting families. The two families are extensions to variable populations of two families of social orderings singled out in Fleurbaey and Zuber (2012).
Proposition 2 The social ordering $R$ satisfies Axioms 1, 2, 3, 6 and 8 iff:
1. Either there exist a scalar $u^c$ and a continuous and increasing function $\varphi$ such that for all $(p, u), (q, v) \in L$: [...]
2. Or there exist a scalar $u^c$, a non-zero scalar $\gamma$, and a continuous and increasing function $\varphi$ such that for all $(p, u), (q, v) \in L$: [...]
Although it singles out an additive and a multiplicative family of social welfare functions, Proposition 2 differs from results in Fleurbaey and Zuber (2012) because it does not use a Pareto for (Restricted Subgroup) Equal Risk Axiom.
The technique of proof is therefore different and involves functional equations.
The exponential representation of Proposition 2 can be interpreted in terms [...] is considered an ex post measure of social welfare, the parameter $\gamma$ can be viewed as expressing an attitude towards the risk on social welfare. Hence $\gamma < 0$ corresponds to risk aversion while $\gamma > 0$ corresponds to risk loving. Therefore, if we assume that $\varphi$ is concave (inequality aversion), the difference with the EDE case (which involves the convex transformation $\varphi^{-1}$) is that we can have a concave function of social welfare, that is, the risk averse case.
When the risk only bears on the horizon, we can interpret γ as capturing risk aversion with respect to population size. The case γ < 0 corresponds to "catastrophe avoidance", i.e., a preference for guaranteeing the existence of some generations rather than taking the risk on an earlier end of history, whereas the case γ > 0 corresponds to "risk equity", implying a preference for spreading the risk of existence over more generations, thereby increasing the probability of existence of later generations (Bommier and Zuber, 2008). More generally, γ also represents social attitudes towards the correlation of outcomes not just across generations but also within, and, in the case of independent individual risks, it embodies social attitudes about the distribution of such risks. If the society prefers people in general to have similar outcomes, that is, if it values positive correlation, or if it values a more equal distribution of independent risks, this is captured by γ > 0 (risk equity).
If one added Axiom 4 to the list in Proposition 2, the exponential case would be excluded because it would require the function $\varphi$ to be such that the mapping $u^i_s \mapsto \exp(\gamma n \varphi(u^i_s))$ be affine in $u^i_s$, which is impossible because this would require $\varphi$ to depend on $n$. Moreover, the additive case would have to satisfy that $\varphi(u^i_s)$ is affine in $u^i_s$, implying that the criterion boils down to utilitarianism in same-number cases.
It is, however, possible to combine inequality aversion and a limited respect for ex ante individual preferences. The following axiom guarantees that individual preferences are respected when only one individual takes a risk that does not affect the unconcerned others.
Axiom 9 (Pareto for individual risk) For all $N \in \mathcal{N}_3$, for all $p \in \Sigma^{m-1}$, for all $u, v \in \mathbf{U}_N$, for all $i \in N$, if for all $j \in N \setminus \{i\}$ there exists $u(j) \in \mathbb{R}$ such that [...]
When Axiom 9 is added to Proposition 2, we obtain two classes of social welfare functions which are very similar to the ones in Bommier and Zuber (2008).
Corollary 4 The social ordering $R$ satisfies Axioms 1, 2, 3, 6, 8 and 9 iff:
1. Either there exists a scalar $u^c$ such that for all $(p, u), (q, v) \in L$: [...]
2. Or there exist a scalar $u^c$, $\Lambda \in \mathbb{R} \setminus \{0\}$ and $\lambda \in \mathbb{R}$ such that for all $(p, u), (q, v) \in L$: [...]
One can notice the similarity between the representations put forward in Corollary 4 and those obtained in Corollary 3. The same remarks as those made after Corollary 3 therefore apply. The choice between the alternative representations hinges on the respective weight one puts on Pareto for Equal Risk versus Independence of the existence (or utility) of the sure.
One may regret that we propose three families instead of identifying a single approach. We believe that our work is to clarify the ethical issues, not to advocate particular solutions. One may argue that the criteria identified in Proposition 1 and Corollaries 1 and 3 are good compromises between inequality aversion and the Pareto principle. But if one is attracted by stronger separability properties and is willing to drop the Pareto principle in order to keep sufficient inequality aversion, the criteria characterized in Proposition 2 and Corollary 4 stand out as the only alternative possibilities.

Implications for the discount rate
In this section we derive the social discount rate for the three families highlighted in Propositions 1 and 2. In the utilitarian approach exemplified in (2) and (3), the discount rate on consumption is the simple addition of a discount rate on utility and a specific term relative to consumption, combining the growth rate of consumption and the rate of decrease of marginal utility. As we will see, not only is the utility discount rate modified with the alternative criteria proposed here, but the consumption term is also generally different because the simple additive structure of (2) and (3) is due to the additive structure of the utilitarian criterion and to specific assumptions about the nature of the risk and the inequalities in consumption.

Social and person-to-person discount rates
To start with, consider a social welfare function taking the general form $\sum_{s=1}^{m} p_s V(u_s)$, where $u_s = (u(c^i_s))_{i \in N(u_s)}$ and $c^i_s$, the consumption of individual i in state s, is a real number. We assume here that all individuals have the same utility function u. Extending the analysis to the case of heterogeneous utility functions is cumbersome but straightforward.
When individuals, not generations, are the constitutive elements of social welfare, the discount rate must be computed primarily between two individuals.

Definition 1
The person-to-person discount rate from an individual i in period 0 to an individual j in period t, denoted $\rho^{i,j}_t$, is:
$$\rho^{i,j}_t = \left( \frac{u'(c^i) \sum_s p_s \, \partial V(u_s)/\partial u^i_s}{\sum_s p_s \, \bigl(\partial V(u_s)/\partial u^j_s\bigr) \, u'(c^j_s)} \right)^{1/t} - 1.$$
To understand this definition, imagine that today (period 0) individual i can make an investment whose sure rate of return is r for the benefit of individual j living in period t. In the margin, such an investment has no effect on social welfare if:
$$u'(c^i) \sum_s p_s \frac{\partial V(u_s)}{\partial u^i_s} = (1+r)^t \sum_s p_s \frac{\partial V(u_s)}{\partial u^j_s} \, u'(c^j_s),$$
with the convention that $\partial V(u_s)/\partial u^j_s = 0$ if j does not exist in state s. Observe that, while the existence and consumption of i in period 0 are certain, the social marginal utility $\partial V(u_s)/\partial u^i_s$ may vary across states of the world.
When many individuals from period 0 make an investment that benefits many individuals in period t, one can evaluate the investment with a social discount rate that aggregates the person-to-person discount rates, provided the shares of the individuals in the investment (either as investors or as beneficiaries) are fixed. Suppose that each donor i in period 0 bears a fraction $\sigma^i_0$ of the marginal investment ε, and that each recipient j in period t receives a fraction $\sigma^j_t$, with $\sum_i \sigma^i_0 = \sum_j \sigma^j_t = 1$. The social discount rate is again the sure rate of return on the marginal investment that leaves social welfare unchanged.
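Definition 2 The social discount rate from period 0 to period t, denoted $\rho_t$, is:
$$\rho_t = \left( \frac{\sum_i \sigma^i_0 \, u'(c^i) \sum_s p_s \, \partial V(u_s)/\partial u^i_s}{\sum_j \sigma^j_t \sum_s p_s \, \bigl(\partial V(u_s)/\partial u^j_s\bigr) \, u'(c^j_s)} \right)^{1/t} - 1.$$
In the sequel we focus on the computation of the person-to-person discount rate. Let us briefly examine how (5) applies in the utilitarian case, before examining the alternative criteria characterized in the previous section. When V is the critical-level utilitarian social welfare function $\sum_{i \in N(u_s)} (u^i_s - u_c)$, one has $\partial V/\partial u^i_s = \partial V/\partial u^j_s = 1$ for the individuals of period 0 and for the individuals of period t in the states in which they exist. This considerably simplifies the formula and one obtains
$$(1+\rho^{i,j}_t)^{-t} = p(j) \, \frac{E_j u'(c^j_s)}{u'(c^i)}, \qquad \text{so that} \qquad \rho^{i,j}_t \approx -\frac{1}{t} \ln p(j) - \frac{1}{t} \ln \frac{E_j u'(c^j_s)}{u'(c^i)},$$
where p(j) is the probability that individual j exists and $E_j$ is the expected value conditional on j's existence. Notably, the critical level plays no role in the value of the discount rate.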
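The indifference condition behind Definition 1 can be illustrated numerically. The sketch below assumes the condition reads: the expected social cost of a marginal unit taken from i equals (1 + ρ)^t times the expected social gain of a marginal unit given to j, with a zero social marginal utility in states where j does not exist. The function names, the CRRA marginal utility and all numbers are illustrative assumptions, not taken from the paper.

```python
def person_to_person_rate(p, dV_i, dV_j, mu_i, mu_j, t):
    """Sure rate of return that leaves expected social welfare unchanged.

    p    : state probabilities p_s
    dV_i : social marginal utility of i's utility, one entry per state
    dV_j : same for j, with 0.0 in the states where j does not exist
    mu_i : private marginal utility u'(c_i) of the period-0 donor
    mu_j : private marginal utilities u'(c_j_s), one entry per state
    t    : number of periods separating donor and recipient
    """
    cost = mu_i * sum(ps * di for ps, di in zip(p, dV_i))
    gain = sum(ps * dj * mj for ps, dj, mj in zip(p, dV_j, mu_j))
    return (cost / gain) ** (1.0 / t) - 1.0


def crra_marginal_utility(c, eta):
    """u'(c) for the CRRA utility c**(1 - eta) / (1 - eta)."""
    return c ** (-eta)


# Utilitarian case: the social marginal utility is 1 wherever the person exists.
# Example 1: j exists in only one of two equally likely states, same consumption.
rho_uncertain_existence = person_to_person_rate(
    [0.5, 0.5], [1.0, 1.0], [1.0, 0.0],
    crra_marginal_utility(1.0, 2.0),
    [crra_marginal_utility(1.0, 2.0), 0.0],  # second entry is a placeholder
    t=1,
)

# Example 2: j exists for sure and consumption grows 2% a year for 50 years.
rho_growth = person_to_person_rate(
    [1.0], [1.0], [1.0],
    crra_marginal_utility(1.0, 2.0),
    [crra_marginal_utility(1.02 ** 50, 2.0)],
    t=50,
)
```

In the first example the rate is driven entirely by the probability that j exists; in the second it reduces to the familiar growth term, since (1 + ρ) = 1.02^η with η = 2.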

The Ramsey formula revisited
We obtain the following general results for the families of social welfare functions introduced in the previous section. The order of presentation follows the increasing order of refinements to the Ramsey formula.
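Proposition 3 For the family of additive social welfare functions the person-to-person discount rate can be approximated in the following way:
$$\rho^{i,j}_t \approx -\frac{1}{t} \ln p(j) - \frac{1}{t} \ln \frac{E_j \mu^j_s}{\mu^i},$$
where $\mu^j_s = \varphi'(u^j_s) \, u'(c^j_s)$ and $\mu^i = \varphi'(u^i) \, u'(c^i)$.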
Proof. For the additive family, $\partial V(u_s)/\partial u^j_s = \varphi'(u^j_s)$ in every state s in which j exists.
In Equation (7), $\mu^j_s$ and $\mu^i$ respectively denote the social priority of increasing the consumption of j in state s and the social priority of increasing the consumption of i. Equation (7) can be decomposed in order to compare it with the utilitarian formula. One has

The second term of Equation
where $\mathrm{Cov}_j$ is the covariance conditional on j existing. From the first line one sees that if $u(c^j_s) > u(c^i)$ for all s, then Θ > 0. On the other hand, from the second line one sees that if $E\varphi'(u(c^j_s))$ is not very much lower than $\varphi'(u(c^i))$ and if the risk on $c^j_s$ is high while the concavity of φ and u is strong (implying a large covariance term), one will have Θ < 0.
In the additive case, the person-to-person social discount rate between i at period 0 and j at period t is therefore lower:
• the more likely it is that j exists;
• the less well-off j is on average (conditional on j existing), relative to i.
The additive case is obviously the closest to utilitarianism, the only difference coming from the inequality aversion that is incorporated when φ′ is decreasing, i.e., when φ is concave.
As with utilitarianism, the critical level u c has no influence at all on the discount rate in this approach.
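The role of inequality aversion in the additive case can be sketched with a small numerical comparison. The code assumes that the rate satisfies (1 + ρ)^t = µ_i / (p(j) E_j µ_j), with the social priority µ = φ′(u(c)) u′(c); the functional forms u(c) = √c and φ(x) = ln x, and all numbers, are hypothetical choices made only for illustration.

```python
import math


def rate(mu_i, p_exist, expected_mu_j, t):
    """Rate solving (1 + rho)**t = mu_i / (p_exist * expected_mu_j)."""
    return (mu_i / (p_exist * expected_mu_j)) ** (1.0 / t) - 1.0


def u(c):
    return math.sqrt(c)            # illustrative utility function


def u_prime(c):
    return 0.5 / math.sqrt(c)


def phi_prime(x):
    return 1.0 / x                 # derivative of phi(x) = ln(x), a concave transform


def priority(c):
    """Social priority mu = phi'(u(c)) * u'(c) in the additive family."""
    return phi_prime(u(c)) * u_prime(c)


c_i = 1.0

# j is richer than i (c_j = 4) and exists with certainty, one period ahead.
rho_util_rich = rate(u_prime(c_i), 1.0, u_prime(4.0), 1)
rho_add_rich = rate(priority(c_i), 1.0, priority(4.0), 1)

# j is poorer than i (c_j = 0.25).
rho_util_poor = rate(u_prime(c_i), 1.0, u_prime(0.25), 1)
rho_add_poor = rate(priority(c_i), 1.0, priority(0.25), 1)

# A lower probability of existence raises the rate, other things equal.
rho_add_rich_unlikely = rate(priority(c_i), 0.5, priority(4.0), 1)
```

With these assumptions, inequality aversion raises the rate when j is better off than i and lowers it when j is worse off, while the critical level never enters the computation.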
With the exponential class of social orderings, one obtains additional terms in the formula for the discount rate.
Proposition 4 For the family of exponential social welfare functions the person-to-person discount rate can be approximated in the following way:
Proof. In the case of the exponential family, we have: One has:
Approximation (8) indicates that the person-to-person discount rate between i at period 0 and j at period t is lower:
• the more likely it is that j exists;
• the less well-off j is on average relative to i;
• the lower the covariance between the state of the population and j's well-being;
• the better-off the population is in the states in which j exists.
The new terms capture the fact that j's well-being gets all the more priority as the rest of the population is better off. The covariance term is the covariance of the relative variations of µ j s and ξ s . The last term is structurally similar because it involves the covariance between j's existence and ξ s . Indeed, one has: where 1 j s = 1 if j ∈ N (u s ) and 0 otherwise.
Proposition 5 For the family of EDE social welfare functions the person-to-person discount rate can be approximated in the following way:
where $\nu_s = \dfrac{\alpha_{n(u_s)}}{n(u_s) \, \varphi'(e(u_s))}$.
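Proof. For the EDE family, we have:
$$\sum_s p_s \frac{\partial V(u_s)}{\partial u^j_s} \, u'(c^j_s) = \sum_{s : j \in N(u_s)} p_s \, \frac{\alpha_{n(u_s)}}{n(u_s) \, \varphi'(e(u_s))} \, \varphi'(u^j_s) \, u'(c^j_s).$$
Then we can proceed as in the proof of Proposition 4 to obtain the result.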
One obtains an approximate formula (9) which is structurally very similar to approximation (8) obtained with the exponential class. Like $\xi_s$, the expression $\nu_s$ depends on the state of the population and increases with the well-being of the population as measured by the EDE.
The novelty however is the role of population size in ν s , which is determined in part by the critical level. Indeed, the term ν s may be increasing or decreasing with the population size depending on the formula for the critical level. It is in particular decreasing if α n is a constant, which is the case when the critical level is equal to e (u s ). It is however increasing if α n = n 1/(1− ) , which is the case when the critical level of consumption is constant and equal to min X. This can be easily explained when V (u s ) takes the simple form .
The sensitiveness to the utility of a single individual indeed increases with the size of the population in that case. Intuitively, observe that the derivative of which is increasing in the number of individuals, whereas the derivative of 1 i /n, which is decreasing in n. In the end, we have the following observations in the case of the EDE family.
The person-to-person discount rate between i at period 0 and j at period t is therefore lower:
• the more likely it is that j exists;
• the worse-off j is relative to i;
• the greater the covariance between the priority of j and $\nu_s$;
• the greater $\nu_s$ is in the states in which j exists.
Informally, the last two terms mean that the discount rate is lower:
• the lower the covariance between the well-being of j and the well-being of the population;
• the better-off the population is in the states in which j exists;
• the greater the covariance between j's well-being and the size of the population, if the critical level is equal to $e(u_s)$; the lower this covariance, if the critical level of consumption is min X;
• the smaller the population is in the states in which j exists, if the critical level is equal to $e(u_s)$; the larger the population is in the states in which j exists, if the critical level of consumption is min X.
From Propositions 4 and 5 we obtain the conclusion that an investment should be evaluated with a lower discount rate when its benefits in the future will be enjoyed more by members of the population that are badly-off and by beneficiaries whose well-being is inversely correlated with social welfare. Note that under a strong degree of inequality aversion, the worse-off's well-being is positively correlated with social welfare. Therefore the two aspects may tend to counterbalance each other. However, the correlation term should be less important when there is an inverse correlation between the poor and the rich than in the case of a positive correlation.

Catastrophic risk
Weitzman (2009) suggests that the discount rate can approach −1, implying an absolute priority of future consumption (the "dismal theorem"), in the presence of fat tails in the distribution of risk. His argument relies on the utilitarian criterion, and in this section we reexamine it with the other criteria introduced here.
Weitzman's basic line of reasoning is as follows. The utilitarian discount rate, without approximation, satisfies the equation:
$$(1+\rho^{i,j}_t)^{-t} = p(j) \, \frac{E_j u'(c^j_s)}{u'(c^i)}.$$
The critical term for the argument is $E_j u'(c^j_s)$, which, in the case of a CRRA function $u(c) = \frac{1}{1-\eta} c^{1-\eta}$, η > 1, and a continuous distribution of c described by a PDF f(c), is equal to $\int c^{-\eta} f(c) \, dc$. If one changes variables so as to refer to a growth rate, $c = c_0 e^{gt}$, the formula becomes
$$c_0^{-\eta} \int e^{-\eta g t} \, \tilde f(g) \, dg,$$
where $\tilde f$ denotes the PDF of g. This is essentially the moment-generating function of $\tilde f$, and it is infinite if $\tilde f$ has a fat tail (in the negative values representing catastrophic risks).
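The divergence of this moment-generating term under a fat tail can be sketched numerically by truncating the integral over growth rates at a lower bound and pushing that bound further out. The two densities below (a normal density and an unnormalised power-law density), the parameter values and the trapezoidal integration are illustrative assumptions, not part of the original argument.

```python
import math


def truncated_moment(density, eta, t, lower, upper=1.0, steps=40000):
    """Trapezoidal approximation of the integral of exp(-eta*g*t) * density(g)
    over growth rates g in [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        g = lower + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-eta * g * t) * density(g)
    return total * h


def thin_tail(g, mean=0.02, sd=0.02):
    """Normal density for the growth rate (thin tails)."""
    z = (g - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fat_tail(g, k=3.0):
    """Unnormalised density that decays like |g|**(-k) (fat tails)."""
    return (1.0 + abs(g)) ** (-k)


ETA, T = 2.0, 1.0
thin_10 = truncated_moment(thin_tail, ETA, T, lower=-10.0)
thin_20 = truncated_moment(thin_tail, ETA, T, lower=-20.0)
fat_10 = truncated_moment(fat_tail, ETA, T, lower=-10.0)
fat_20 = truncated_moment(fat_tail, ETA, T, lower=-20.0)

# Closed form for the normal case: exp(-eta*t*mean + 0.5*(eta*t*sd)**2).
thin_exact = math.exp(-ETA * T * 0.02 + 0.5 * (ETA * T * 0.02) ** 2)
```

With thin tails the truncated value no longer changes once the bound is far enough out, whereas with the power-law density it keeps growing by many orders of magnitude, which is the numerical signature of an infinite integral.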
A fat tail means that $\tilde f(g) \propto (-g)^{-k}$ for some k > 0 when g → −∞. Note that f cannot have a fat tail in the low values of c because c is bounded from below. What happens, however, is that one has $f(c) = \tilde f\left(\frac{1}{t} \ln \frac{c}{c_0}\right) \propto -k \ln c$ when c → 0.⁶ Such a PDF, for instance, has the property that, conditional on c < q, the probability of c < q/2 remains above 50% when q → 0.
The fact that c is bounded from below suggests that one does not really need to invoke fat tails on temperatures to support an argument in favor of giving an absolute priority to the future. Supposing that u(0) = −∞, it is enough to have a positive probability of $c_t = 0$ (or g = −∞) to make it an absolute priority to transfer resources to raise $c_t$ above zero. More generally, if there is a subsistence level $c_{\min}$ such that $u(c_{\min}) = -\infty$, it is an absolute priority to raise c above $c_{\min}$ at any period.
⁶ The integral $\int_0^q c^{-\eta} \ln c \, dc$ does not converge when η ≥ 1.
Given the frightening worst-case scenarios involving temperature increases above +10°C or +20°C, it is not unreasonable to assign a positive probability to the event of having a substantial part of the population at subsistence level in future generations.⁷ The weakness of the argument in the preceding paragraph is, rather, the assumption of infinite marginal utility. A typical form for the utility function could be $u(c) = \frac{1}{1-\eta}\left(c^{1-\eta} - (c_{\min})^{1-\eta}\right)$, which has a finite marginal utility at $c_{\min} > 0$. With such a function, the utilitarian discount rate remains finite even when the probability of $c = c_{\min}$ is positive.⁸
Let us see if the alternative criteria introduced in this paper shed a different light than utilitarianism on this issue. Consider the additive criterion
$$\sum_s p_s \sum_{i \in N(u_s)} \left(\varphi(u^i_s) - \varphi(u_c)\right).$$
With this criterion, the discount rate tends to −1 when $E_j \varphi'(u^j_s) \, u'(c^j_s) \to +\infty$. If $u(c_{\min}) = 0$ and $\varphi'(0) = +\infty$, it becomes an absolute priority to raise c above $c_{\min}$ even when $u'(c_{\min})$ is finite. We therefore see a different possible argument for this conclusion. A high degree of inequality aversion suffices to give the individuals at the subsistence level an absolute priority over those who are better off.
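The contrast between the utilitarian and the additive criterion near the subsistence level can be sketched as follows. The code assumes the shifted CRRA utility described above with η = 2 and c_min = 1, a hypothetical transform φ(x) = √x (so that φ′(0) = +∞), a recipient who exists with certainty, and a two-state lottery in which future consumption is close to c_min with probability 0.1. All of these values are illustrative assumptions.

```python
import math

ETA = 2.0
C_MIN = 1.0


def u(c):
    """Shifted CRRA utility with u(C_MIN) = 0 and finite u'(C_MIN)."""
    return (c ** (1.0 - ETA) - C_MIN ** (1.0 - ETA)) / (1.0 - ETA)


def u_prime(c):
    return c ** (-ETA)


def phi_prime(x):
    """Derivative of the illustrative transform phi(x) = sqrt(x)."""
    return 0.5 / math.sqrt(x)


def priority(c):
    """Additive-criterion social priority phi'(u(c)) * u'(c)."""
    return phi_prime(u(c)) * u_prime(c)


def rate(mu_i, expected_mu_j, t):
    return (mu_i / expected_mu_j) ** (1.0 / t) - 1.0


def rates(delta, p_bad=0.1, c_i=2.0, c_good=4.0, t=1):
    """Utilitarian and additive rates when the bad state has c = C_MIN + delta."""
    c_bad = C_MIN + delta
    util = rate(u_prime(c_i),
                p_bad * u_prime(c_bad) + (1.0 - p_bad) * u_prime(c_good), t)
    add = rate(priority(c_i),
               p_bad * priority(c_bad) + (1.0 - p_bad) * priority(c_good), t)
    return util, add


util_far, add_far = rates(1e-2)
util_mid, add_mid = rates(1e-6)
util_near, add_near = rates(1e-10)
```

As the bad-state consumption approaches c_min, the utilitarian rate settles at a finite value (about 0.6 here), while the additive rate falls toward −1, in line with the argument that inequality aversion alone can generate an absolute priority.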
⁷ One should not forget that this is already the case today, for reasons having little to do with the climate. As Schelling (1995) argued, if the possible poverty of future generations is the reason to give them priority, we should give a stronger priority to the poor who exist today with certainty.
This line of argument, however, no longer works with the EDE criterion because the critical term is then
$$E_j \left[ \frac{\alpha_{n(u_s)}}{n(u_s)} \, \frac{\varphi'(u^j_s)}{\varphi'(e(u_s))} \, u'(c^j_s) \right].$$
In this term, the ratio $\varphi'(u^j_s)/\varphi'(e(u_s))$ does not necessarily tend to infinity when $u^j_s \to 0$ because one may then have $e(u_s) \to 0$. In particular, consider $\varphi(x) = \frac{1}{1-\varepsilon} x^{1-\varepsilon}$, with ε > 1. Then, when m individuals out of n have equal utility that converges to zero, $u^j_s \to 0$, while the others have positive utilities, one has
$$\frac{\varphi'(u^j_s)}{\varphi'(e(u_s))} \to \left(\frac{n}{m}\right)^{\varepsilon/(\varepsilon-1)}.$$
The difference between the additive criterion and the EDE criterion, as shown by the axiomatic results, involves the Pareto criterion. When the individuals have a finite marginal willingness to pay to reduce the risk of falling below subsistence (which seems to be the case), the utilitarian criterion and the EDE criterion, which respect individual preferences over risk (provided there is no inequality, for the EDE), will not give an absolute priority to avoiding this risk for future generations. In contrast, the additive criterion, with the φ function, introduces an extra risk aversion linked to inequality aversion, which may impose such a priority against the preferences of the population.
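The boundedness of the ratio φ′(u_j)/φ′(e(u)) in the EDE case can be checked numerically. The sketch assumes that the equally distributed equivalent is e(u) = φ⁻¹(mean of φ(u_i)), takes φ(x) = x^(1−ε)/(1−ε) with ε = 2, and uses a hypothetical population of four in which one individual's utility shrinks toward zero while the others stay at 1. Under these assumptions a direct computation gives the limit (n/m)^(ε/(ε−1)).

```python
EPS = 2.0   # inequality aversion parameter, assumed greater than 1


def phi(x):
    return x ** (1.0 - EPS) / (1.0 - EPS)


def phi_inverse(y):
    return ((1.0 - EPS) * y) ** (1.0 / (1.0 - EPS))


def phi_prime(x):
    return x ** (-EPS)


def ede(utilities):
    """Equally distributed equivalent: phi^{-1} of the mean of phi(u_i)."""
    return phi_inverse(sum(phi(x) for x in utilities) / len(utilities))


def priority_ratio(x, m, n, other=1.0):
    """phi'(x) / phi'(e(u)) when m of n individuals have utility x."""
    utilities = [x] * m + [other] * (n - m)
    return phi_prime(x) / phi_prime(ede(utilities))


ratio_near_zero = priority_ratio(1e-9, m=1, n=4)
limit = (4 / 1) ** (EPS / (EPS - 1.0))
```

The ratio stays close to 16 however small the utility of the worst-off individual becomes, because the EDE itself collapses toward zero at the same speed.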
The exponential criteria behave in one way or the other depending on how the parameter γ modifies social risk aversion. A positive γ (risk equity) will tend to tolerate the risk of a catastrophe, whereas a negative γ (catastrophe avoidance) will display a strong risk aversion induced by the combination of the φ function and the exponential.
In conclusion, the "dismal theorem" is often criticized for involving fat tails. But fat tails are less needed than a positive probability of a big catastrophe, which is in fact widely accepted. The main weakness of the theorem is rather on the utility side. Criteria that respect individual preferences over risk tolerate the risk of big catastrophes as much as individuals tolerate the risk of individual catastrophes. But the additive and exponential criteria do offer the possibility of obtaining a strong aversion to catastrophes, at the cost of going against individual preferences.

Conclusion
The purpose of this paper was threefold. First, we introduced a general framework in which the horizon is finite but uncertain, and uncertainty bears on future utility as well as on the composition of the future population. Second, in this framework we characterized non-utilitarian criteria which embody a greater concern for equity than utilitarianism, at the cost of relaxing either separability properties or the Pareto principle. Third, the analysis remained individualistic throughout, highlighting the specific level and correlation characteristics of individuals that determine the person-to-person discount rates.
Our most general finding concerning discounting is that the social discount rate should take account of the distribution of the benefits and costs of the investments across individuals and across states of the world. Generally the evaluation is all the more favorable (i.e., the discount rate is lower) as the investment benefits individuals who are worse off, whose well-being is inversely correlated to the well-being of the population, and pays more in states in which the beneficiaries are worse off.
The role of correlations between individual and social well-being as an important factor in evaluations is our key contribution to the refinement of the Ramsey formula. Benefiting an individual who is badly off when the population is well off has a greater impact on social welfare, on average, than benefiting an individual who is badly off when the population is also badly off. This may seem disturbing because it seems to give a bonus to the states of the world in which the population is relatively well off. This occurs, however, only in the very special trade-off between helping a poor individual whose well-being is positively correlated with social welfare and a poor individual whose well-being is negatively correlated with it. But most policy issues affect broader populations. Suppose one invests in a public good that is useful mostly in bad states (e.g., flood protection). When a bad state occurs, the investment benefits more individuals who are badly off. Even if the correlation between their well-being and social welfare is high, the fact that the investment benefits many badly off individuals may be sufficient to give it a greater social value than a similar investment that would create a public good suited to good states (e.g., a new transportation infrastructure).
Concerning the effect of inequality aversion on social discounting, it is known that inequality aversion increases discounting when future generations are better off. It is also known that when growth is uncertain, and there is a substantial risk of future generations being less well-off, a higher inequality aversion can on the contrary decrease the discount rate. Our more general approach adds that, if the investment helps the most vulnerable in future generations, inequality aversion further decreases the discount rate. In addition, inequality aversion magnifies the effect of the correlation on discounting when future consumption is uncertain.
In the end, this paper provides reasons to think that the specific features of climate policies may justify evaluating them with a lower discount rate than other policies. Indeed, they protect the vulnerable, whose fate may be inversely correlated to that of the rich, and they pay more in states of the world in which damages hit the poorest. Further research is however needed to substantiate those intuitions. It would require a more precise description of the uncertainty (on consumption and the existence of future generations) as well as good scenarios describing the costs and benefits. Moreover, in order to assess climate policies, one may also go beyond the discount rate and evaluate the changes in the risks they induce, their non-marginal effects and their precise impact.
Another direction of research that we intend to pursue is to enrich the framework further so as to make it possible to discuss the measurement of individual well-being. In this paper the measurement of utility has been treated as exogenous. A more concrete description of the economic allocations would enable us to further specify the social evaluation criteria in relation to principles of fairness, and to provide more concrete indications for applications to the assessment of integrated scenarios describing the long-term evolution of the climate and the economy. In particular, the relative prices of different commodities (environmental goods vs consumption goods) change with time, yielding different discount rates (Gollier, 2010). It may be important to take into account the relative scarcity of some goods when evaluating the welfare of future generations.

Appendix A Proofs
Proof of Lemma 1. Take any $N \in \mathcal{N}_3$. Define the ordering $R_N$ on $X^{|N|}$ as follows:
By Axiom 1, the relation $R_N$ is transitive, reflexive, complete and continuous. By Axiom 3, the relation is monotonic. By Axiom 5, any subset of N is separable. Therefore, as |N| ≥ 3, there exist continuous and increasing functions $\varphi^i_N$ such that $\sum_{i \in N} \varphi^i_N(u(i))$ represents $R_N$. By Axiom 2 the representation must be symmetric and can be written $\sum_{i \in N} \varphi_N(u(i))$. By the definition of the relation $R_N$, we also obtain that, whenever $N(u_s) = N(v_s) = N$:
Thus there must exist a continuous and increasing function $\Psi_N$ such that, for all $u_s$ such that $N(u_s) = N$,
Note that Axiom 2 imposes the Anonymity requirement for subpopulations that have the same size but may differ in their members. In particular, it implies that, whenever |N| = |M| and there is a bijection π :
We can therefore take $\Psi_N = \Psi_M = \Psi_{|N|}$ and $\varphi_N = \varphi_M = \varphi_{|N|}$.
Proof of Proposition 1. First, note that Lemma 1 applies.
Take any $N \in \mathcal{N}_3$. For every $u \in X^{|N|}$, define the equally distributed equivalent (EDE) of u as the scalar e(u) ∈ X such that $(p, (u, ..., u)_m) \, I \, (p, e(u)1_{m,n})$, where $1_{m,n}$ is the m × n unit matrix (all its components equal one). By Axiom 3 and the continuity of V, e(u) exists for every $u \in X^{|N|}$. By the representation in Lemma 1, we obtain that
By Axiom 1, the fact that $(p, (u, ..., u)) \, I \, (p, e(u)1_{m,n})$ implies that $V(u) = V(e(u)1_n)$, where $n = n(u_s)$ and $1_n$ is the unit vector of $\mathbb{R}^n$. Therefore, for all u, v ∈ X N and all p, q ∈ Σ m−1 ,
Summarizing, for all u, v ∈ X N and all p, q ∈ Σ m−1 ,
is a vNM utility function for the society. Hence there exist a positive real number $\alpha_{|N|}$ and a real number $\beta_{|N|}$ such that
Consider the allocations u, v described in Axiom 8. Next consider w, z ∈ U such that:
The prospect w is similar to u except that individuals get the EDE welfare level in state s. Therefore, we have $V(w_s) = V(u_s)$. By Axiom 8, we also Using the representation of V and the expression for the EDE, we obtain:
Because $v^k_s = C(V(u_s), n(u_s)) = C(V(w_s), n(w_s)) = z^k_s$, the equality reduces to:
, this yields the functional equation
The solution of this Pexider equation is F(x) = ax + b and G(x) = ax + b/|N| for some a > 0 and b ∈ R. Letting $y = \varphi^{-1}_{|N|}(x)$, the equation In other words,
Reasoning by induction, it can be shown that, for all l ∈ N, there exist a positive real number $a_l$ and a real number $b_l$ such that $\varphi_l = a_l \varphi + b_l$, for a continuous increasing function φ, so that:
With this formula, we can compute the critical level $C(V(u_s), n(u_s))$ used to construct the prospect v from u in Axiom 8:
Proof of Corollary 1. Let $c_n = C(x, n)$. By a simple change of variable, $z = (x - \beta_n)/\alpha_n$, we obtain the functional equation:
where z ∈ X. Letting $a = \frac{\alpha_n}{\alpha_{n+1}} > 0$, $b = \frac{\beta_n - \beta_{n+1}}{\alpha_{n+1}}$, this equation reads
The equation implies $\varphi(a c_n + b) = \varphi(c_n)$, so that $a c_n + b = c_n$.
One obtains: Note that it is impossible to have a = 1 because that would mean f (z) = n n+1 f (z) for all z.
One therefore has φ(z) = f (z − c n ) + φ(c n ) = Θ(z − c n ) |z − c n | ω + φ(c n ). The case z = c n requires that ω > 0. Therefore, the fact that φ is increasing implies If there is z ∈ X such that z > c n , the strict concavity of φ imposes ω < 1.
If there is z ∈ X such that z < c n , the strict concavity of φ imposes ω > 1.
Therefore, only two cases are possible: either for all z ∈ X, z ≥ c n , Θ(z − c n ) > 0 (except possibly at z = c n ), and c n = min X, or for all z ∈ X, z ≤ c n , Θ(z−c n ) < 0 (except possibly at z = c n ), and c n = max X.
Proof of Corollary 3. First note that Axiom 7 implies Axiom 5 so that Proposition 1 applies. Consider any specific N and M ⊂ N such that |M| ≥ 2.
Consider now any u, v,ũ,ṽ ∈ U N such that for all s ∈ {1, · · · , m}, u i s =ũ i s for all i ∈ M, for all s ∈ {1, · · · , m}, v i s =ṽ i s for all i ∈ M Assume furthermore and without loss of generality that in the representation given in Proposition 1 φ(u * ) = 0 and denote t = i∈N \M u(i). Axiom 7 implies that m s=1 Hence there exists an increasing function F such that: m We therefore end up with the following functional equation:
The solution of this functional equation is (Aczél, 1966, Theorem 2, p.153):
• either there exist $A \in \mathbb{R}_{++}$ and $B \in \mathbb{R}$ such that $\varphi^{-1}(z) = Az + B$;
• or there exist $A \in \mathbb{R}_{++}$, $\gamma \in \mathbb{R} \setminus \{0\}$ and $B \in \mathbb{R}$ such that $\varphi^{-1}(z) = \frac{A}{\gamma} e^{\gamma z} + B$.
The latter case yields that φ −1 1
Proof of Proposition 2. First note that Axiom 6 implies Axiom 5, so that Lemma 1 applies. Now take any $N \in \mathcal{N}_3$. Consider a subpopulation C ⊂ N such that |N| > |C| > 1. Denote R = N \ C and r = |R|. Consider all u ∈ U N such that for all i ∈ C there exists u(i) ∈ R, u i = u(i)1 m . We denote this set U C N and consider the restriction of R to this set. We denote u R = ((u i ) i∈R ). By Axiom 6, the subset R ∪ A is separable for R for all A ⊂ C (including A = ∅). Therefore, by Theorem 1 in Gorman (1968), every subset of C, including C itself, is also separable. By the Corollary of Theorem 1 in Gorman (1968), there exist continuous functions h N : X rm → R andφ i |N | : X → R, i ∈ C, such that for all u, v ∈ U C N ,
Therefore, there exists an increasing function f |N | such that for all u ∈ U C N , s∈{1,··· ,m}
Fix a particular x 0 ∈ X. We can normalize φ |N | (x 0 ) = 0 and h |N | (x R 0 ) = 0, where x R 0 is the prospect u R such that u i s = x 0 for all i ∈ R and s ∈ {1, · · · , m}. Restricting attention to prospects u ∈ U C N such that u R = x R 0 , one obtains the functional equation:ḡ i∈C φ |N | (u i 1 ) = i∈C φ * i φ |N | (u i 1 ) .
Denote n = |N|. There are two cases:
Case 1: $\Psi_n(z) = A_n z + B_n$.
Reasoning recursively, one can write
There is no harm in changing the constant, so that one can more elegantly write, for all $u_s$ such that $n(u_s) \geq 3$,
Case 2: $\Psi_n(z) = \frac{A_n}{\gamma_n} \exp(\gamma_n z) + B_n$.
It must be the case that
For this to be true, we need $B_n = B_{n+1} = B$ and
$$\gamma_n \sum_{i \in N} \varphi_n(u^i_s) + \ln(A_n/|\gamma_n|) = \gamma_{n+1} \Bigl( \sum_{i \in N} \varphi_{n+1}(u^i_s) + \varphi_{n+1}(c) \Bigr) + \ln(A_{n+1}/|\gamma_{n+1}|).$$
Using the same argument as before, it must be the case that $\gamma_n \varphi_n(z) = \gamma \varphi(z) + b_n$, $\gamma_{n+1} \varphi_{n+1}(z) = \gamma \varphi(z) + b_{n+1}$, and $\ln(A_{n+1}/|\gamma_{n+1}|) + (n+1) b_{n+1} = \ln(A_n/|\gamma_n|) + n b_n - \gamma \varphi(c)$, which can be rewritten $\frac{A_{n+1}}{\gamma_{n+1}} \exp((n+1) b_{n+1}) = \frac{A_n}{\gamma_n} \exp(n b_n) \exp(-\gamma \varphi(c))$. Reasoning by recursion down to n = 3, and up to a multiplicative constant in the first term, one can write
Proof of Corollary 4. Consider u, v ∈ U N described in Axiom 9. By Proposition 2, we need to consider two cases:
Case 1: Social preferences are represented by $\sum_{s \in \{1, \cdots, m\}} p_s \sum_{i \in N(u_s)} (\varphi(u^i_s) - \varphi(c))$. Axiom 9 implies that, for u, v ∈ U N satisfying the premises of the axiom, Hence, by the uniqueness of vNM representations up to an increasing affine transformation, there exist $a \in \mathbb{R}_{++}$ and $b \in \mathbb{R}$ such that φ(z) = az + b. Hence, by the uniqueness of vNM representations up to an increasing affine transformation, it must be the case that there exist $a \in \mathbb{R}_{++}$ and $b \in \mathbb{R}$ such that