Quantile Jensen’s inequalities

Quantiles of a random variable are crucial quantities that give more delicate information about its distribution than the mean or the median alone. We establish Jensen’s inequality for the $q$-quantile ($q \geq 0.5$) of a random variable, which includes as a special case the result of Merkle (Stat. Probab. Lett. 71(3):277–281, 2005), where Jensen’s inequality for the median (i.e., $q = 0.5$) was given. We also refine this inequality in the case where $q < 0.5$. An application to confidence intervals for parameters in pivotal quantities is also considered, by virtue of a rigorous description of the relationship between quantiles and intervals that have the required probability.


Introduction and preliminaries
Inequalities play a central role in all branches of mathematics, especially in approximation theory, as illustrated by monographs such as Hardy et al. [2] and Kazarinoff [3]. The famous Jensen inequality, which applies to convex functions, is one of the most useful inequalities in probability and statistics. A function $f(x)$ defined on $\mathbb{R}$ is said to be convex if $f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda) f(y)$ for all $x$ and $y$ and all $0 \le \lambda \le 1$. In addition, $f(x)$ is said to be concave if $-f(x)$ is convex. When applied with the expectation operator in probability, Jensen's inequality states that, for any real-valued random variable $X$ with finite expectation ($E|X| < \infty$) and for any convex function $f$, $f(EX) \le Ef(X)$. Equality holds if and only if, for every line $a + bx$ that is tangent to $f(x)$ at $x = EX$, $P(f(X) = a + bX) = 1$. Applied to concave functions, Jensen's inequality is reversed: if $f$ is concave, then $f(EX) \ge Ef(X)$.
There are some interesting examples of Jensen's inequality. One immediate application with $f(x) = x^2$ shows that $EX^2 \ge (EX)^2$ for any real-valued random variable. This also follows from $\mathrm{Var}(X) = EX^2 - (EX)^2 \ge 0$. Also, $1/x$ is convex on $(0, \infty)$; accordingly, $E(1/X) \ge 1/EX$ if $X > 0$ almost surely. Other applications of Jensen's inequality can be found in almost every textbook on probability and statistics, for example, in proving inequalities between any two of the three different kinds of means (see Casella and Berger [4]) and in proving the relationship between convergence in the $r$th mean and convergence in the $s$th mean with $0 < s < r$ (see Serfling [5], p. 7). Jensen's inequality also plays a significant role in applied mathematics (see Mitrinović, Pečarić, and Fink [6]; Malamud [7]), information theory (see Dragomir [8]; Budimir et al. [9]), and the pricing theory of financial derivatives (see Hull [10]). Various generalizations and variants of Jensen's inequality can also be found, for example, in To and Yip [11], Rigler et al. [12], and Agnew and Pecaric [13].
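The two moment inequalities above can be checked with exact arithmetic on a small example. The following is a minimal Python sketch; the three-point distribution and the helper name `expectation` are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def expectation(dist, f=lambda x: x):
    """Return E f(X) for a finite discrete distribution {value: probability}."""
    return sum(p * f(x) for x, p in dist.items())


# Hypothetical positive random variable, chosen only for illustration.
dist = {
    Fraction(1): Fraction(1, 2),
    Fraction(2): Fraction(1, 4),
    Fraction(4): Fraction(1, 4),
}

mean = expectation(dist)                              # EX
second_moment = expectation(dist, lambda x: x * x)    # E X^2
mean_reciprocal = expectation(dist, lambda x: 1 / x)  # E(1/X)

# Jensen's inequality for the convex functions x^2 and 1/x (X > 0).
print(second_moment >= mean ** 2)
print(mean_reciprocal >= 1 / mean)
```

Using `Fraction` keeps the comparison exact, so a failed check could not be blamed on floating-point rounding.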
Another analogue of Jensen's inequality was given by Merkle [1], where the median replaces the expectation operator. Moreover, Merkle [14] generalized the median Jensen inequality to the multivariate case, and Kwiecien and Gather [15] proved an analogous Jensen inequality with the expectation operator replaced by the Tukey median operator.
As the median is a special case of a quantile with $q = 1/2$, it is natural to ask whether Jensen's inequality still holds for a general $q$-quantile. As far as we know, no relevant work is available in the literature. The present paper makes three contributions. Firstly, we describe the relationship between quantiles and intervals that have the corresponding probability when $q \ge 1/2$; we also show that this relationship is violated when $q < 1/3$, and we illustrate with several examples that, when $1/3 \le q < 1/2$, it may or may not hold.
Secondly, we show rigorously that Jensen's inequality with the quantile operator and C-functions, a class more general than convex functions (defined below), still holds for the $q$-quantile ($q \ge 1/2$) of any random variable $X$, while a refinement of the corresponding inequality is derived for $q < 1/2$, where the quantile Jensen inequality itself does not hold. Thirdly, we establish a stringent description of confidence intervals for parameters in any pivotal quantity by virtue of the relationship in the first contribution.
The remainder of the paper is arranged as follows. Two crucial lemmas that are fundamental for the theoretical development in the later sections are presented in Sect. 2. Section 3 gives our main results, including the quantile Jensen inequality when $q \ge 1/2$ and its refinement when $q < 1/2$. Section 4 applies the assertions developed in Sect. 2 to construct a confidence interval for a pivotal quantity. Some concluding remarks are included in Sect. 5.

Two crucial lemmas
Let $X$ be a random variable defined on some probability space $(\Omega, \mathcal{F}, P)$. By definition, a $q$-quantile ($q \in [0, 1]$) of $X$ is any real number $\mu_q$ that satisfies the inequalities

$$P(X \le \mu_q) \ge q \quad \text{and} \quad P(X \ge \mu_q) \ge 1 - q. \tag{1}$$

For any random variable $X$, its $q$-quantile either is unique or there are infinitely many of them, in which case all its $q$-quantiles fill a closed bounded interval $[a, b]$. One significant feature of the quantile is that $\mu_q$ is nondecreasing in $q$, that is, $\mu_q \le \mu_{q'}$ whenever $q \le q'$.
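To make the definition concrete, the sketch below computes the set of all $q$-quantiles of a finite discrete distribution. It assumes the standard two-sided definition, namely that $\mu$ is a $q$-quantile when $P(X \le \mu) \ge q$ and $P(X \ge \mu) \ge 1 - q$, and it is restricted to $0 < q < 1$. The helper name `quantile_set` and the fair-die example are hypothetical.

```python
from fractions import Fraction


def quantile_set(dist, q):
    """Return (lo, hi), the closed interval of all q-quantiles, for 0 < q < 1.

    dist maps each support point to its probability.
    """
    if not 0 < q < 1:
        raise ValueError("this sketch only handles 0 < q < 1")
    support = sorted(dist)
    # Smallest support point m with P(X <= m) >= q.
    lo = next(m for m in support
              if sum(p for x, p in dist.items() if x <= m) >= q)
    # Largest support point m with P(X >= m) >= 1 - q.
    hi = next(m for m in reversed(support)
              if sum(p for x, p in dist.items() if x >= m) >= 1 - q)
    return lo, hi


fair_die = {k: Fraction(1, 6) for k in range(1, 7)}

median_set = quantile_set(fair_die, Fraction(1, 2))      # a whole interval
upper_quartile = quantile_set(fair_die, Fraction(3, 4))  # a single point
print(median_set, upper_quartile)
```

For the fair die the median is not unique (every point of the returned interval is a median), whereas the 0.75-quantile is a single point, which illustrates the "unique or a closed interval" dichotomy.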
In the sequel we refer to all intervals like (-∞, a] and [b, ∞) for any real number a and b as closed half lines.
Lemma 1 Let $I$ be either a closed half line or a closed interval on $\mathbb{R}$ and $q \ge 1/2$. We have:
(i) If $P(X \in I) = q$, then there exist both a $q$-quantile and a $(1-q)$-quantile of $X$ in $I$; additionally, for any $\eta$ with $1-q < \eta < q$, $I$ contains all $\eta$-quantiles of $X$.
(ii) Suppose that the set $S$ possesses the following property: if $J$ is any closed interval which has $S$ as a proper subset, that is, $S \subset J$ and $S \ne J$, then $P(X \in J) > q$. Then $S$ contains all $\eta$-quantiles of $X$ for $1-q \le \eta \le q$.

As a special case where $q = 1/2$, the lemma reduces to Lemma 1.2 of Merkle [1]. Accordingly, we shall dwell on the case $q > 1/2$.
so that $a$ is a $q$-quantile too, another contradiction. In conclusion, there exists at least one $q$-quantile in $I$.
Next, given $P(X \in I) = q$ and $q > 1/2$, in order to prove the existence of $\mu_{1-q} \in I$, let us consider the three possibilities for $I$.
Additionally, because there are both a $(1-q)$-quantile and a $q$-quantile in $I$, the assertion holds for $1-q < \eta < q$ immediately by the monotonicity of the quantile function.
(ii) Firstly, we prove that $S$ contains all $q$-quantiles of $X$. Suppose that $\mu_q$ is a $q$-quantile of $X$ and $\mu_q \notin S$. Then there is a closed interval $J$ such that $J \supset S$ and $\mu_q \notin J$. Hence $J$ is disjoint from one of the sets $A = (-\infty, \mu_q]$ or $B = [\mu_q, \infty)$. Thus, either $P(X \in A) < 1-q < q$ or $P(X \in B) < 1-q$, which implies that $\mu_q$ is not a $q$-quantile of $X$. Therefore, all $q$-quantiles of $X$ are in $S$. Next, in a similar fashion, we can show that (ii) holds for $\eta = 1-q$. Finally, by the monotonicity of the quantile function, (ii) holds for $1-q < \eta < q$.
Since the assertions in Lemma 1 rest on the condition $q \ge 1/2$, it is natural to ask what happens when $q < 1/2$. Intuitively, if $P(X \in I) = q < 1/2$, then the set $I$ is relatively "small", since the measure of the entire real line $\mathbb{R}$ under $P(X \in \cdot)$ is one, whereas the complement of $I$ is large enough to possibly contain the quantiles. By contrast, a set $I$ such that $P(X \in I) = q \ge 1/2$ is relatively "large", which is the key to the assertions in Lemma 1. Thus, we may not be able to draw conclusions similar to those of Lemma 1. Nevertheless, we have the following lemma.

Lemma 2
Let $I$ be any interval or half line on $\mathbb{R}$ and $0 < q < 1/3$. If $P(X \in I) = q$, then either $\mu_q$ or $\mu_{1-q}$ is not in $I$.
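Proof Since $1-q > q$, we have $\mu_{1-q} \ge \mu_q$. It follows from the definition that

$$P(\mu_q \le X \le \mu_{1-q}) \ge P(X \ge \mu_q) + P(X \le \mu_{1-q}) - 1 \ge 2(1-q) - 1 = 1 - 2q > q.$$

Thus, the interval $[\mu_q, \mu_{1-q}]$ cannot be covered by $I$; hence at least one of $\mu_q$ and $\mu_{1-q}$ is not in $I$, which finishes the proof.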
This lemma shows that the conclusion of Lemma 1 is no longer valid when $q < 1/3$; in other words, $q < 1/3$ is a sufficient condition for Lemma 1 to fail. However, when $1/3 \le q < 1/2$, both positive and negative examples exist.
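The counting argument behind Lemma 2 can be illustrated numerically. The sketch below assumes the two-sided quantile definition ($P(X \le \mu) \ge q$ and $P(X \ge \mu) \ge 1 - q$) and checks that the shortest interval joining a $q$-quantile to a $(1-q)$-quantile already carries probability at least $1 - 2q$, which exceeds $q$ when $q < 1/3$. The uniform distribution on $\{1, \ldots, 8\}$ and the helper names are hypothetical.

```python
from fractions import Fraction


def quantile_set(dist, q):
    """Closed interval (lo, hi) of all q-quantiles of a finite distribution."""
    support = sorted(dist)
    lo = next(m for m in support
              if sum(p for x, p in dist.items() if x <= m) >= q)
    hi = next(m for m in reversed(support)
              if sum(p for x, p in dist.items() if x >= m) >= 1 - q)
    return lo, hi


def prob_between(dist, a, b):
    """Return P(a <= X <= b)."""
    return sum(p for x, p in dist.items() if a <= x <= b)


X = {k: Fraction(1, 8) for k in range(1, 9)}  # hypothetical example
q = Fraction(1, 4)                            # any q < 1/3 would do

_, largest_q_quantile = quantile_set(X, q)
smallest_upper_quantile, _ = quantile_set(X, 1 - q)

# Any interval holding one quantile of each kind contains this interval.
mass = prob_between(X, largest_q_quantile, smallest_upper_quantile)
print(mass, mass >= 1 - 2 * q > q)
```

Because every interval containing both a $q$-quantile and a $(1-q)$-quantile has at least this mass, no interval of probability exactly $q$ can contain both.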

Quantile Jensen's inequality
Quantile Jensen's inequality is established below for C-functions. Recall that C-functions are real-valued functions $f$ defined on $\mathbb{R}$ such that, for any $u \in \mathbb{R}$, the set

$$f^{-1}((-\infty, u]) = \{x \in \mathbb{R} : f(x) \le u\}$$

is a closed interval, a singleton, or an empty set.
Note that a function $f$ is lower semicontinuous if $f^{-1}((-\infty, u])$ is a closed set for every $u \in \mathbb{R}$. Therefore, a C-function is lower semicontinuous, hence $f(x) \le \liminf_{y \to x} f(y)$ for any $x \in \mathbb{R}$. In addition, every C-function $f$ has finite left and right limits at any $x$, and $f(x) \le \min(f(x-), f(x+))$.
On the other hand, every convex function on $\mathbb{R}$ is a C-function, as is every monotone continuous function on $\mathbb{R}$. Similarly, any continuous function that is nonincreasing on $(-\infty, a)$ and nondecreasing on $(a, \infty)$, for a fixed $a$, is a C-function, e.g., all loss functions used in statistics. More discussion can be found in Merkle [1].
Theorem 1 (Jensen's inequality for quantile) Let $g$ be a C-function and $X$ be any real random variable. Suppose that $q \ge 1/2$. If $\mu^X_q$, the $q$-quantile of $X$, is unique, then

$$g(\mu^X_q) \le \mu^{g(X)}_q, \tag{2}$$

where $\mu^{g(X)}_q$ is any $q$-quantile of $g(X)$. Conversely, if $\mu^{g(X)}_q$ is unique, (2) holds for any $q$-quantile of $X$. Generally speaking, for any $q$-quantile of $g(X)$, there exists a $q$-quantile of $X$ such that (2) holds.
Proof Let $\mu^{g(X)}_q$ be a $q$-quantile of $g(X)$, and define $I = g^{-1}((-\infty, \mu^{g(X)}_q])$. Then $P(X \in I) = P(g(X) \le \mu^{g(X)}_q) \ge q$, and by Lemma 1 there is a $\mu^X_q$ in $I$, which implies $g(\mu^X_q) \le \mu^{g(X)}_q$. Therefore, for any $q$-quantile of $g(X)$ there exists a $q$-quantile of $X$ such that (2) holds. In particular, if $\mu^X_q$ is unique, (2) holds for any $q$-quantile of $g(X)$.

Now suppose that $\mu^{g(X)}_q$ is unique. We only need to verify that the interval $I$ has the property described in (ii) of Lemma 1. If it did not, there would be a closed interval $J$ such that $I$ is a proper subset of $J$ and $P(X \in J) = q$. Without loss of generality, we assume q )/2. It is easy to show that the closed interval $I_1 = g^{-1}((-\infty, M])$ is contained in $J$ owing to the lower semicontinuity of $g$, which also means Consequently, $M$ is a $q$-quantile of $g(X)$, which contradicts the uniqueness of $\mu^{g(X)}_q$. This completes the proof of Theorem 1.
Remark 1 The theorem indicates that, when $q \ge 1/2$, the quantile operator $\mu^X_q$, along with any C-function, satisfies Jensen's inequality, which substantially extends the existing literature; unfortunately, this cannot be guaranteed when $q < 1/2$, as illustrated in the following remark.
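Theorem 1 can be sanity-checked on a small discrete example. The sketch below reads the inequality as $g(\mu^X_q) \le \mu^{g(X)}_q$, uses the convex function $g(x) = x^2$ (a C-function), and assumes the two-sided quantile definition. The three-point distribution and the helpers `quantile_set` and `pushforward` are hypothetical. For each $q \ge 1/2$ it looks for a witness among the endpoints of the quantile set of $X$ and compares against the smallest $q$-quantile of $g(X)$, which is the most demanding case.

```python
from collections import defaultdict
from fractions import Fraction


def quantile_set(dist, q):
    """Closed interval (lo, hi) of all q-quantiles of a finite distribution."""
    support = sorted(dist)
    lo = next(m for m in support
              if sum(p for x, p in dist.items() if x <= m) >= q)
    hi = next(m for m in reversed(support)
              if sum(p for x, p in dist.items() if x >= m) >= 1 - q)
    return lo, hi


def pushforward(dist, g):
    """Distribution of g(X) when X has distribution dist."""
    out = defaultdict(Fraction)
    for x, p in dist.items():
        out[g(x)] += p
    return dict(out)


def g(x):
    return x * x  # convex, hence a C-function


# Hypothetical distribution, chosen only for illustration.
X = {-3: Fraction(1, 5), -1: Fraction(3, 10), 2: Fraction(1, 2)}
gX = pushforward(X, g)

results = {}
for q in (Fraction(1, 2), Fraction(7, 10), Fraction(9, 10)):
    x_lo, x_hi = quantile_set(X, q)
    g_lo, _ = quantile_set(gX, q)
    # Some q-quantile of X (here an endpoint) satisfies the inequality.
    results[q] = min(g(x_lo), g(x_hi)) <= g_lo

print(results)
```

Checking only the endpoints is a sufficient witness test, not an exhaustive search over the quantile interval; it is enough here because the theorem only asserts that some $q$-quantile of $X$ works.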
Remark 2 Theorem 1 does not hold in general for $q < 1/2$. For example, let $g(x) = x^2$ and $X$ be a discrete random variable with

Although Theorem 1 may not hold for $q < 1/2$, we can still construct a similar inequality for the $q$-quantile of $X$ in this case.

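Corollary 1 Let $g$ be a C-function and $X$ be any real random variable. Suppose that $q < 1/2$. If $\mu^X_q$, the $q$-quantile of $X$, is unique, then

$$g(\mu^X_q) \le \mu^{g(X)}_{1-q}, \tag{3}$$

where $\mu^{g(X)}_{1-q}$ is any $(1-q)$-quantile of $g(X)$. If $\mu^{g(X)}_{1-q}$ is unique, (3) holds for any $q$-quantile of $X$. Generally speaking, for any $(1-q)$-quantile of $g(X)$, there exists a $q$-quantile of $X$ such that (3) holds.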
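Proof By noting that $\mu^X_q = -\mu^{-X}_{1-q}$ and that if $g(x)$ is a C-function, then $h(x) = g(-x)$ is also a C-function, Theorem 1 applied to $h$ and $-X$ with $1-q > 1/2$ gives

$$g(\mu^X_q) = h(\mu^{-X}_{1-q}) \le \mu^{h(-X)}_{1-q} = \mu^{g(X)}_{1-q}.$$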

Remark 3 Assertion (3) can be reinforced as
1-q by virtue of Theorem 1 and $1-q > 1/2$. Furthermore, if $g(\cdot)$ happens to be an increasing function, one has $g(\mu^X_q) \le g(\mu^X_{1-q}) \le \mu^{g(X)}_{1-q}$. To illustrate with a concrete example, let $X$ be the discrete random variable defined in Remark 2. It is easy to check that, for all $\mu_{0.25}$, we have $(\mu_{0.25})^2 \le \mu^{X^2}_{0.75}$.
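The contrast between the failure of the quantile Jensen inequality for $q < 1/2$ and the refined bound can be seen on a small example. The sketch below is not the distribution of Remark 2; it uses a hypothetical three-point distribution with $g(x) = x^2$ and $q = 1/10$, and it assumes that assertion (3) reads $g(\mu^X_q) \le \mu^{g(X)}_{1-q}$, as the example at the end of Remark 3 suggests. The helper names are also hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def quantile_set(dist, q):
    """Closed interval (lo, hi) of all q-quantiles of a finite distribution."""
    support = sorted(dist)
    lo = next(m for m in support
              if sum(p for x, p in dist.items() if x <= m) >= q)
    hi = next(m for m in reversed(support)
              if sum(p for x, p in dist.items() if x >= m) >= 1 - q)
    return lo, hi


def pushforward(dist, g):
    """Distribution of g(X) when X has distribution dist."""
    out = defaultdict(Fraction)
    for x, p in dist.items():
        out[g(x)] += p
    return dict(out)


def g(x):
    return x * x


# Hypothetical distribution (not the one used in Remark 2).
X = {-3: Fraction(1, 5), -1: Fraction(3, 10), 2: Fraction(1, 2)}
gX = pushforward(X, g)
q = Fraction(1, 10)

x_q, _ = quantile_set(X, q)               # the q-quantile of X is unique here
same_level, _ = quantile_set(gX, q)       # q-quantile of g(X)
upper_level, _ = quantile_set(gX, 1 - q)  # (1 - q)-quantile of g(X)

jensen_at_q = g(x_q) <= same_level  # inequality at level q
refined = g(x_q) <= upper_level     # bound at level 1 - q
print(jensen_at_q, refined)
```

In this example the comparison at level $q$ fails while the comparison against the $(1-q)$-quantile of $g(X)$ holds, which is the behavior the corollary describes.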

Confidence intervals
In statistical inference, it is important to construct a confidence interval for an unknown parameter $\theta$. The first step in constructing a confidence interval is to find a pivotal quantity for $\theta$. Given a random sample $X_1, \ldots, X_n$ of size $n$, an expression $Y = Q(X_1, \ldots, X_n; \theta)$ is called a pivotal quantity for $\theta$ if the distribution of $Y$ does not depend on $\theta$. For example, let $X_1, \ldots, X_n$ be a sample drawn from the population $X \sim N(\mu, \sigma^2)$, and let $\bar{X}$ and $S^2$ be the sample mean and sample variance. Then

$$\frac{\sqrt{n}(\bar{X} - \mu)}{S} \quad \text{and} \quad \frac{(n-1)S^2}{\sigma^2}$$

are pivotal quantities for $\mu$ and $\sigma^2$, respectively. For more discussion of pivotal quantities, see Casella and Berger [4], p. 427. The second step is, for a specified value of $\alpha \in (0, 1)$, to find numbers $a$ and $b$, which depend on $\alpha$ but not on $\theta$, such that $P_\theta(a \le Q(X_1, \ldots, X_n; \theta) \le b) \ge 1 - \alpha$.
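The two-step procedure can be sketched in code for the simplest textbook case. The example below assumes, as a simplification not made in the text, that $\sigma$ is known, so that $\sqrt{n}(\bar{X} - \mu)/\sigma$ is standard normal and serves as the pivot. It takes $a$ and $b$ to be the $\alpha/2$- and $(1 - \alpha/2)$-quantiles of the pivot, which is one common choice among many. The function name `z_interval` and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_interval(sample, sigma, alpha=0.05):
    """Equal-tailed 1 - alpha interval for mu, assuming sigma is known.

    Pivot: sqrt(n) * (mean - mu) / sigma, which is N(0, 1) for normal data.
    """
    n = len(sample)
    pivot = NormalDist()              # distribution of the pivot, free of mu
    a = pivot.inv_cdf(alpha / 2)      # alpha/2-quantile of the pivot
    b = pivot.inv_cdf(1 - alpha / 2)  # (1 - alpha/2)-quantile of the pivot
    xbar = fmean(sample)
    scale = sigma / sqrt(n)
    # a <= sqrt(n)(xbar - mu)/sigma <= b  is equivalent to
    # xbar - b*scale <= mu <= xbar - a*scale.
    return xbar - b * scale, xbar - a * scale


sample = [4.9, 5.3, 5.1, 4.7, 5.0]  # hypothetical measurements
lower, upper = z_interval(sample, sigma=0.2)
print(lower, upper)
```

Inverting the event for the pivot into an event for $\mu$ is what turns the pair $(a, b)$ into a confidence interval; a different choice of $a$ and $b$ with the same coverage would give a different interval.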
Obviously, the confidence interval obtained from the procedure above is not unique, because another interval on which the pivotal quantity has the same probability can be found. However, on the basis of the relationship between intervals and quantiles in Lemma 1, the more natural definition of a confidence interval for the pivotal quantity given below is unique.
Theorem 2 (A characterization of the $1-\alpha$ confidence interval for a pivotal quantity $Q(X; \theta)$ with $\alpha < 0.5$; usually $\alpha = 0.01, 0.05, 0.1$) For any pivotal quantity $Q(X; \theta)$, its $1-\alpha$ confidence interval is the intersection of all closed intervals $I$ that satisfy: if $J$ is any closed interval that contains $I$ as a proper subset, then $P(J) > 1 - \alpha/2$.
Proof Let $I_q$ denote the set of all $q$-quantiles of $Q(X; \theta)$ and $\tilde{I}$ denote the intersection of all closed intervals $I$ with the property stated in the theorem. We only need to show that

Remark 4 The key point in the proof of Theorem 2 is the use of Lemma 1, whose condition $q \ge 1/2$ is fulfilled automatically because $1 - \alpha/2 \ge 1/2$ for any $\alpha \in [0, 1]$. Theorem 2 gives a characterization of the confidence interval for parameters in a pivotal quantity. However, it is a challenge to extend this definition of confidence interval to higher dimensions because of its shape restriction.

Concluding remarks
The paper has presented two critical lemmas that help prove the quantile Jensen inequality and construct confidence intervals for parameters in pivotal quantities. All these results, however, concern univariate random variables; we shall study the corresponding theory for multivariate variables in future work.