Intrinsic posterior regret gamma-minimax estimation for the exponential family of distributions

In practice, it is desirable to have estimators that are invariant under reparameterization. The invariance property helps to formulate a unified solution to the underlying estimation problem. In robust Bayesian analysis, a frequent criticism is that optimal estimators are not invariant under smooth reparameterizations. This paper considers the problem of posterior regret gamma-minimax (PRGM) estimation of the natural parameter of the exponential family of distributions under intrinsic loss functions. We show that, under the class of Jeffreys Conjugate Prior (JCP) distributions, PRGM estimators are invariant to smooth one-to-one reparameterizations. We apply our results to several distributions and different classes of JCP, as well as the usual conjugate prior distributions. We observe that, in many cases, invariant PRGM estimators in the class of JCP distributions can be obtained by simple modifications of PRGM estimators in the usual class of conjugate priors. Moreover, when the class of priors is convex or depends on a hyper-parameter belonging to a connected set, we show that the PRGM estimator under the intrinsic loss function can be Bayes with respect to a prior distribution in the original prior class. Theoretical results are supplemented with several examples and illustrations.


Introduction
Suppose x is a realization of a random sample X with a sampling model given by a family of densities {f(·|θ) : θ ∈ Θ} with respect to a σ-finite measure ν on a sample space χ, where θ ∈ Θ is the unknown parameter of interest. Let π(·) be a prior distribution on Θ and π(·|x) denote the posterior distribution of θ given x. In standard Bayesian analysis, one needs to specify the true prior distribution π(·). In practice, however, elicitation of the true prior distribution can never be done without error. Hence, we usually need to consider a class Γ of prior distributions that reflects (approximately) true prior beliefs, i.e., the true prior distribution π(·) is an unknown element of Γ. Robust Bayesian analysis is designed to acknowledge such prior uncertainty by considering the class Γ of plausible prior distributions instead of a single prior distribution π and studying the corresponding range of Bayesian solutions. See Berger (1994) and Rios Insua and Ruggeri (2000) for more details. One may also attempt to determine an optimal estimator δ by minimizing some measure of robustness. Several criteria have been proposed for the selection of procedures in robust Bayesian studies. In this paper, we study the maximal posterior regret method (e.g., Rios Insua and Ruggeri, 2000; Rios Insua et al., 1995) to obtain the posterior regret gamma-minimax (PRGM) estimator of the unknown parameter for the one-parameter exponential family of distributions. The PRGM criterion has recently been used by many authors from both theoretical and practical points of view. For example, Gómez-Déniz (2009) investigated the use of PRGM for credibility premium estimation in actuarial science, Boratyńska (2002, 2006) in insurance for collective risk model analysis, and Jafari Jozani and Parsian (2008) in statistical inference based on record data.
1 Corresponding author: m−jafari−jozani@umanitoba.ca
For an observed value x, a prior distribution π, and the corresponding posterior distribution π(·|x), we denote the posterior risk of an estimate δ(x) of the unknown parameter θ under a loss L(θ, δ) by r(x, δ) = E[L(θ, δ(x))|x]. The Bayes estimator of θ under the loss function L(θ, δ) is then given by a δπ(X) such that r(x, δπ) = inf_δ r(x, δ).
In this paper, we study the construction of PRGM estimators under the so-called intrinsic loss functions. These loss functions shift attention from the distance between the estimator δ and the true parameter value θ to the more relevant distance between the statistical models they label. More specifically, the intrinsic loss of using δ as a proxy for θ is the intrinsic distance between the true model f(x|θ) and the model f(x|δ), that is,

L(θ, δ) = d( f(·|θ), f(·|δ) ), (2)

where d(·, ·) is a suitable distance measure. In practice, intrinsic loss functions can be used as benchmark losses when the utility function of the underlying statistical problem cannot be elicited by practitioners. A desirable property of intrinsic loss functions is that they are invariant under one-to-one smooth reparameterizations. This invariance provides a very convenient tool for statistical applications. We show that, under suitable conditions, intrinsic loss functions can be used to formulate a unified set of solutions to the problem of PRGM estimation of the unknown parameter of the exponential family of distributions that is consistent under reparameterization, a rather obvious requirement that, unfortunately, many statistical methods fail to satisfy.
In Section 2, we obtain the PRGM estimator of the natural parameter θ of the exponential family of distributions under the intrinsic loss function (2).

PRGM estimation under intrinsic loss functions
Suppose X is a random variable whose distribution belongs to the one-parameter exponential family

f(x|θ) = r(x) β(θ) e^{−θ t(x)}, (3)

where r(x) > 0, β(θ) > 0, and θ is the unknown real-valued natural parameter of the model. The density is taken with respect to the Lebesgue measure for continuous distributions and the counting measure for discrete distributions. Suppose δ is an estimate of θ with both θ, δ ∈ Θ. We define the intrinsic loss function (2), using the Kullback-Leibler measure between f(x|θ) and f(x|δ), as

L(θ, δ) = E_θ[ log( f(X|θ)/f(X|δ) ) ]. (4)

Loss function (4) can be interpreted as the expected log-likelihood ratio in favour of the true model.
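As a quick numerical sanity check (our sketch, not from the paper): for the exponential model f(x|θ) = θe^{−θx}, which is a member of (3) with r(x) = 1, t(x) = x, and β(θ) = θ, the Kullback-Leibler loss (4) has the closed form log(θ/δ) + (δ − θ)/θ, and a direct quadrature of E_θ[log(f(X|θ)/f(X|δ))] reproduces it. Function names are ours.

```python
import math

def kl_numeric(theta, delta, upper=60.0, n=200_000):
    # Midpoint-rule quadrature of the Kullback-Leibler divergence
    # E_theta[log f(X|theta) - log f(X|delta)] for f(x|t) = t*exp(-t*x),
    # integrating on (0, upper); the truncated tail is negligible.
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        f_t = theta * math.exp(-theta * x)
        f_d = delta * math.exp(-delta * x)
        total += f_t * math.log(f_t / f_d) * h
    return total

def kl_closed_form(theta, delta):
    # log(beta(theta)/beta(delta)) + (delta - theta)*H(theta)
    # with beta(t) = t and H(t) = 1/t for the exponential model.
    return math.log(theta / delta) + (delta - theta) / theta
```

For θ = 2 and δ = 0.5 both evaluate to about 0.636, and the loss vanishes only at δ = θ.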
Thus, the intrinsic loss function (4) not only has the desired invariance property, but it is also related to the relevant measure of evidence in the Neyman-Pearson Lemma. Note that the intrinsic loss function (4) is invariant under reparameterization, since the parameters affect the loss function only via the probability distributions they label, which are independent of the particular parameterization. For a general reference on intrinsic losses and additional details, we refer to Robert (1996) and Bernardo (2011).
First, we give a lemma that identifies the intrinsic loss function for the exponential family of distributions.

Lemma 1 For the exponential family (3), the intrinsic loss function (4) is given by

L(θ, δ) = log( β(θ)/β(δ) ) + (δ − θ) β′(θ)/β(θ). (5)
Let H(t) := β′(t)/β(t). A straightforward calculation shows that the posterior risk associated with δ, under the loss function (5), is

r(x, δ) = E[ log β(θ) + (δ − θ)H(θ) | x ] − log β(δ). (6)

The Bayes estimator of θ can therefore be obtained by minimizing (6) in δ as follows:

δπ(x) = H^{−1}( E[H(θ) | x] ). (7)

Since the densities f(x|θ) in (3) have a decreasing monotone likelihood ratio in t(X), and E_θ[t(X)] = H(θ), the function H(·) is decreasing. Therefore, the Bayes estimator δπ(X) is unique. Furthermore, the posterior regret for estimating θ using δ instead of the optimal estimator δπ is

ρ(δπ, δ) = r(x, δ) − r(x, δπ) = log( β(δπ)/β(δ) ) + (δ − δπ)H(δπ). (8)

Note that ρ(δπ, δ), as a function of δπ, first decreases and then increases, with a unique minimum at δπ = δ. The main result of this section is given in the following theorem, which obtains the PRGM estimator of θ under the intrinsic loss function (5).

Theorem 1 Suppose δ̲(x) = inf{δπ(x) : π ∈ Γ} and δ̄(x) = sup{δπ(x) : π ∈ Γ} exist and are finite. Then, the PRGM estimator of θ for the exponential family (3) under the intrinsic loss function (5) is the unique solution in δ of ρ(δ̲, δ) = ρ(δ̄, δ), given by

δPR(x) = [ log( β(δ̄)/β(δ̲) ) + δ̄ H(δ̄) − δ̲ H(δ̲) ] / [ H(δ̲) − H(δ̄) ]. (9)
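A concrete sketch of the Bayes rule δπ(x) = H^{−1}(E[H(θ)|x]), under an assumption of ours that is not fixed by the text at this point: the exponential model f(x|θ) = θe^{−θx} with a Gamma(α, λ) prior (shape α, rate λ), for which everything is available in closed form.

```python
import random

def bayes_estimator(x, alpha, lam):
    # Exponential model f(x|theta) = theta*exp(-theta*x) with a
    # Gamma(alpha, lam) prior (shape alpha, rate lam): the posterior is
    # Gamma(alpha + 1, lam + x), and with H(t) = 1/t,
    #     E[H(theta) | x] = (lam + x) / alpha   (finite for alpha > 0),
    # so delta_pi(x) = H^{-1}(E[H(theta)|x]) = alpha / (lam + x).
    return alpha / (lam + x)

def mc_posterior_mean_H(x, alpha, lam, n=200_000, seed=1):
    # Monte Carlo estimate of E[1/theta | x] under the posterior,
    # to be compared with (lam + x)/alpha.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        theta = rng.gammavariate(alpha + 1, 1.0 / (lam + x))  # shape, scale
        total += 1.0 / theta
    return total / n
```

With α = 3, λ = 2, and x = 1, the rule gives δπ = 3/3 = 1, matching H^{−1} applied to the Monte Carlo posterior mean of H(θ).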
Example 2 (Exponential distribution). Suppose X ∼ Exp(σ) is an exponential random variable with pdf f(x|σ) = (1/σ) e^{−x/σ}, x > 0, where σ > 0 is the unknown parameter. The pdf f(x|σ) belongs to the exponential family (3) with θ = 1/σ, t(x) = x, r(x) = 1, and β(θ) = θ. In this case, H(θ) = θ^{−1}, and the intrinsic loss function (5) reduces to the Stein loss

L(θ, δ) = δ/θ − log(δ/θ) − 1.

Using (9), subject to the existence of δ̲ and δ̄, the PRGM estimator of θ under the Stein loss function is given by

δPR(x) = δ̲ δ̄ log( δ̄/δ̲ ) / ( δ̄ − δ̲ ).

The PRGM estimator of σ is also obtained in Example 5.
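The Stein-loss PRGM estimate can be checked as the equal-regret point between the extreme Bayes estimates δ̲ and δ̄: the posterior regret ρ(m, δ) = log(m/δ) + (δ − m)/m takes the same value at m = δ̲ and m = δ̄. A small sketch of ours:

```python
import math

def regret_stein(m, d):
    # Posterior regret of using d when the Bayes estimate is m,
    # under the Stein loss: rho(m, d) = log(m/d) + (d - m)/m.
    return math.log(m / d) + (d - m) / m

def prgm_stein(d_lo, d_hi):
    # The equation rho(d_lo, d) = rho(d_hi, d) is linear in d; solving it
    # gives d_lo*d_hi*log(d_hi/d_lo)/(d_hi - d_lo), which always lies
    # between d_lo and d_hi.
    return d_lo * d_hi * math.log(d_hi / d_lo) / (d_hi - d_lo)
```

For δ̲ = 1 and δ̄ = 2 the estimate is 2 log 2 ≈ 1.386, and the two regrets agree to machine precision.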
Example 3 (Binomial distribution). Suppose X ∼ Bin(n, p) with pmf f(x|p) = (n choose x) p^x (1 − p)^{n−x}, where n is known, x = 0, 1, . . . , n, and p ∈ [0, 1] is the unknown parameter. The pmf f(x|p) is a member of the exponential family (3) with

θ = log( (1 − p)/p ),  t(x) = x,  r(x) = (n choose x),  β(θ) = ( e^θ/(1 + e^θ) )^n.

We also have H(θ) = n/(1 + e^θ), which results in the intrinsic loss function

L(θ, δ) = n [ log( (1 + e^δ)/(1 + e^θ) ) − (δ − θ) e^θ/(1 + e^θ) ].

Using (9), subject to the existence of δ̲ and δ̄, the PRGM estimator of θ is given by

δPR(x) = [ δ̄ − δ̲ + log( (1 + e^{δ̲})/(1 + e^{δ̄}) ) + δ̄/(1 + e^{δ̄}) − δ̲/(1 + e^{δ̲}) ] / [ 1/(1 + e^{δ̲}) − 1/(1 + e^{δ̄}) ]. (11)

In Example 7, we obtain the PRGM estimator of p.
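When no closed form is convenient, the equal-regret point defining the PRGM estimate can be located by bisection. The following generic sketch (our code) uses the binomial quantities H(θ) = n/(1 + e^θ) and β(θ) = (e^θ/(1 + e^θ))^n consistent with it; δ̲ and δ̄ are assumed extreme Bayes estimates supplied by the user.

```python
import math

def regret(log_beta, H, m, d):
    # Posterior regret: rho(m, d) = log beta(m) - log beta(d) + (d - m)*H(m).
    return log_beta(m) - log_beta(d) + (d - m) * H(m)

def prgm_bisect(log_beta, H, d_lo, d_hi, iters=100):
    # g(d) = rho(d_lo, d) - rho(d_hi, d) is increasing in d when H is
    # decreasing, with g(d_lo) < 0 < g(d_hi), so bisection converges.
    lo, hi = d_lo, d_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g = regret(log_beta, H, d_lo, mid) - regret(log_beta, H, d_hi, mid)
        if g < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Binomial quantities for n = 10: beta(t) = (e^t / (1 + e^t))**n.
n = 10
log_beta = lambda t: n * (t - math.log1p(math.exp(t)))
H = lambda t: n / (1.0 + math.exp(t))
```

With δ̲ = −1 and δ̄ = 1, the two posterior regrets at the returned point agree up to the bisection tolerance.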
The PRGM estimate of θ for the exponential family (3) under the intrinsic loss function (5) and in the class Γ of prior distributions is given by (14).

Remark 1 One can also consider other classes of conjugate priors, such as Γ1, with λ1 = λ2 = λ0 fixed, or Γ2, with α1 = α2 = α0 fixed. The PRGM estimator of θ in Γ1 or Γ2 can be obtained from (14) by letting λ1 = λ2 = λ0 or α1 = α2 = α0, respectively.

Intrinsic PRGM estimation
In Section 2, we obtained the PRGM estimator of the natural parameter θ of the exponential family under the intrinsic loss function. In some applications, interest may lie in the PRGM estimation of the original parameter of the underlying model rather than the natural parameter θ. Unfortunately, like many other methods, PRGM estimation is not necessarily invariant under reparameterization.
Although results of this nature, which are not invariant under reparameterization, can sometimes be interesting in theory, they tend to be less useful in practice. Indeed, it is difficult to convince a practitioner that the PRGM estimator of h(θ) is not necessarily h(δPR). In this section, we obtain PRGM estimators that are invariant under one-to-one smooth reparameterizations, hence the name intrinsic PRGM estimators.
First, we give the following result.
Lemma 3 Suppose δ J π is the Bayes estimator of the natural parameter θ of the exponential family (3) under the intrinsic loss function (4) with respect to the JCP distribution (15). For every one-to-one smooth transformation h(θ), the Bayes estimator of h(θ) is h(δ J π ).
Proof: The proof is similar to that of Lemma 6.2 of Robert (1996) and is hence omitted.

Theorem 2 Suppose δ^{Γ_J}_{IPR}(X) is the PRGM estimator of the unknown parameter θ for the exponential family (3) under the intrinsic loss function (5) with respect to a class Γ_J of JCP distributions for θ. Then, for any one-to-one smooth transformation h(θ), the PRGM estimator of h(θ) is h(δ^{Γ_J}_{IPR}(X)).
For the PRGM estimation of θ under the entropy loss function and its application to record data analysis, we refer to Jafari Jozani and Parsian (2008). Similarly, if η∗ = −(1/a) log θ and α = 0, then the intrinsic PRGM estimator of η∗ under the LINEX loss function coincides with the PRGM estimator obtained in Boratyńska (2006).
For the exponential family (3), suppose that the prior distribution belongs to a class Γ_J of JCP distributions of the form (15), with hyper-parameters α ∈ [α1, α2] and λ ∈ [λ1, λ2] for suitable choices of α1 < α2 and λ1 < λ2. We continue with some applications of Theorem 2 under this class of priors. Similar results can be obtained for other classes of JCP distributions (see Remark 1), which we do not present here. In view of Theorem 2, the critical condition for obtaining an intrinsic PRGM estimator is that the elements of the underlying class of prior distributions are of the form (15) and that the underlying loss function is intrinsic. We observe that, in many cases (see Examples 6 and 7), intrinsic PRGM estimators under Γ_J can be obtained from the PRGM estimators under the usual class Γ of conjugate priors with modified values of the αi's and λi's in Γ, i = 1, 2. One can easily check that this happens whenever the mean-value parameter is conjugate for the natural parameter in the sense of Gutiérrez-Peña and Smith (1995). In the one-parameter case, a sufficient condition for this is that the exponential family have a quadratic variance function (see Section 3.3 of Gutiérrez-Peña and Smith, 1995).

PRGM, Intrinsic PRGM and Bayes estimators
In this section, we provide some general results concerning the Bayesianity of the PRGM and intrinsic PRGM estimators of θ for the exponential family (3) under the intrinsic loss function (5) with respect to priors in the underlying class of prior distributions. The results are presented only for PRGM estimators of θ, but they also apply to intrinsic PRGM estimators with simple modifications. Our framework in this section closely resembles the one introduced by Rios Insua et al. (1995), who considered a similar problem under the quadratic loss function. Results of this nature were also obtained by Zen and DasGupta (1993) under the quadratic loss function for the binomial distribution. Several of the following preliminary results and detailed proofs are included here for the sake of completeness.
The idea is to check the continuity of the underlying Bayes estimator with respect to the prior. Similar to Rios Insua et al. (1995) we study two cases, when (a) the class of prior distributions is convex, or (b) the underlying class of prior distributions depends on a hyper-parameter belonging to a connected set.
First, consider the situation where the class Γ of priors is convex; that is, if π0, π1 ∈ Γ, then πt = tπ0 + (1 − t)π1 belongs to Γ for any t ∈ [0, 1]. Suppose that X is a random variable whose density belongs to the family of distributions (3). Let ψ(t) = H(δπt(x)), which is a decreasing function of δπt for any t ∈ [0, 1]. In the next lemma, we show that ψ(t) is a continuous function on its domain t ∈ [0, 1].
Lemma 4 Let a_i = ∫_Θ H(θ) π_i(θ) f(x|θ) dθ and m_i(x) = ∫_Θ π_i(θ) f(x|θ) dθ for i = 0, 1, and suppose that the a_i's and m_i(x)'s exist and are finite. Then ψ(t) is continuous on t ∈ [0, 1].

Proof. We have

ψ(t) = H(δπt(x)) = E_{πt}[H(θ)|x] = ( t a_0 + (1 − t) a_1 ) / ( t m_0(x) + (1 − t) m_1(x) ),

which is a continuous function of t, t ∈ [0, 1], since the denominator is positive. ✷

Now, we use the continuity of ψ(t) to prove that, under the conditions of Lemma 4, the PRGM estimator δPR is Bayes if the class of priors is convex.
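Lemma 4 can be illustrated numerically. Assuming (our choice, for illustration only) the exponential model f(x|θ) = θe^{−θx} with H(θ) = 1/θ and Gamma(αi, λi) priors, both a_i and m_i(x) have closed forms, and ψ(t) is a ratio of two linear functions of t with positive denominator, hence continuous:

```python
def a_m(x, alpha, lam):
    # Under a Gamma(alpha, lam) prior (shape alpha, rate lam) and the
    # exponential model f(x|theta) = theta*exp(-theta*x), with H(t) = 1/t:
    #   a = int H(theta) pi(theta) f(x|theta) dtheta = (lam/(lam + x))**alpha
    #   m = int pi(theta) f(x|theta) dtheta
    #     = alpha * lam**alpha / (lam + x)**(alpha + 1)
    a = (lam / (lam + x)) ** alpha
    m = alpha * lam ** alpha / (lam + x) ** (alpha + 1)
    return a, m

def psi(t, x, prior0, prior1):
    # psi(t) = E[H(theta)|x] under pi_t = t*pi0 + (1-t)*pi1, a ratio of
    # two linear functions of t, continuous on [0, 1].
    a0, m0 = a_m(x, *prior0)
    a1, m1 = a_m(x, *prior1)
    return (t * a0 + (1 - t) * a1) / (t * m0 + (1 - t) * m1)
```

At t = 1 this reduces to E[H(θ)|x] = (λ0 + x)/α0 for the single prior π0, and small changes in t produce small changes in ψ(t).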
Theorem 3 Suppose Γ is a convex class of prior distributions on the unknown parameter θ of the exponential family of distributions (3). Then, there exists a prior distribution π ∈ Γ such that δPR = δπ, where δPR is defined in (9).
Lemma 5 Suppose that ∫_Θ H(θ) f(x|θ) dθ exists and is finite. Then, ψ(π) := H(δπ(x)) is continuous in π, in the topology generated by the l∞ distance.
In the following lemma, we provide a sufficient condition under which the PRGM (or intrinsic PRGM) estimator is Bayes with respect to the same prior in the underlying class of prior distributions, regardless of the observed value x.
Lemma 6 Let Γ = {πα : α ∈ [α1, α2]} be the class of prior distributions. Suppose the Bayes estimator Ψ(α, x) = H^{−1}( E[H(θ)|x] ) is a differentiable function of the hyper-parameter α and the observed value x, and assume that we are under the conditions of Theorem 4. If the equation Ψ(α, x) = δPR(x) has a constant (data-independent) solution in α, then there is a data-independent prior πα ∈ Γ for which the PRGM estimate is the Bayes estimate of the natural parameter θ of the exponential family (3) under the intrinsic loss function (4).

Now, differentiating the equation Ψ(α(x), x) = δPR(x) with respect to x leads to

∂Ψ(α, x)/∂α · dα(x)/dx + ∂Ψ(α, x)/∂x = dδPR(x)/dx. (18)

If α(x) is data-independent, i.e., α(x) = α, then dα(x)/dx = 0. The desired value of α is then the constant solution of equation (18), leading to a data-independent prior for which the PRGM estimator is Bayes.
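As a closing illustration of Lemma 6 (our sketch, assuming the exponential model with Γ = {Gamma(α, λ0) : α ∈ [α1, α2]} and the Stein loss): the Bayes rule is Ψ(α, x) = α/(λ0 + x), the equal-regret PRGM estimate is α∗/(λ0 + x) with α∗ = α1 α2 log(α2/α1)/(α2 − α1), and since α∗ does not depend on x, the PRGM estimator is Bayes with respect to the single prior π_{α∗} for every x.

```python
import math

def bayes_rule(alpha, lam0, x):
    # Psi(alpha, x) = H^{-1}(E[H(theta)|x]) = alpha/(lam0 + x) for the
    # exponential model with a Gamma(alpha, lam0) prior (shape, rate).
    return alpha / (lam0 + x)

def prgm(alpha1, alpha2, lam0, x):
    # Equal-regret point between the extreme Bayes estimates under the
    # Stein loss: d_lo*d_hi*log(d_hi/d_lo)/(d_hi - d_lo).
    d_lo = bayes_rule(alpha1, lam0, x)
    d_hi = bayes_rule(alpha2, lam0, x)
    return d_lo * d_hi * math.log(d_hi / d_lo) / (d_hi - d_lo)

def alpha_star(alpha1, alpha2):
    # Data-independent solution of Psi(alpha, x) = delta_PR(x):
    # the common factor 1/(lam0 + x) cancels, leaving a constant in x.
    return alpha1 * alpha2 * math.log(alpha2 / alpha1) / (alpha2 - alpha1)
```

For example, alpha_star(1, 3) ≈ 1.648 lies inside (1, 3), and Ψ(α∗, x) reproduces the PRGM estimate for every observed x.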