A new refinement of Jensen’s inequality with applications in information theory

Abstract In this paper, we present a new refinement of Jensen's inequality with applications in information theory. The refinement of Jensen's inequality is obtained based on the general functional in the work of Popescu et al. As applications in information theory, we provide new, tighter bounds for Shannon's entropy and some f-divergences.


Introduction
Let C be a convex subset of the linear space X and f a convex function on C. If $\mathbf{p}=(p_1,\ldots,p_n)$ is a probability sequence and $\mathbf{x}=(x_1,\ldots,x_n)\in C^n$, then
$$ f\!\left(\sum_{i=1}^{n} p_i x_i\right) \le \sum_{i=1}^{n} p_i f(x_i) $$
holds [1]. If f is concave, then the preceding inequality is reversed. Jensen's inequality plays a crucial role in the theory of mathematical inequalities. It is applied widely in mathematics, statistics, and information theory, and it can be used to deduce many important inequalities such as the arithmetic-geometric mean inequality, the Hölder inequality, the Minkowski inequality, and Ky Fan's inequality.
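As a quick numerical sanity check (an illustration only, not part of the paper's formal development), the following Python sketch verifies the discrete Jensen inequality for the convex function $f(t)=-\ln t$ and shows that this choice recovers the weighted arithmetic-geometric mean inequality; the weights p and points x below are arbitrary example values.

import math

p = [0.2, 0.5, 0.3]          # probability weights, sum to 1
x = [1.0, 4.0, 9.0]          # points in the convex domain (0, inf)

f = lambda t: -math.log(t)   # convex on (0, inf)

lhs = f(sum(pi * xi for pi, xi in zip(p, x)))
rhs = sum(pi * f(xi) for pi, xi in zip(p, x))
assert lhs <= rhs            # Jensen's inequality f(sum p_i x_i) <= sum p_i f(x_i)

# Equivalent AM-GM form: weighted geometric mean <= weighted arithmetic mean.
geo = math.prod(xi ** pi for pi, xi in zip(p, x))
ari = sum(pi * xi for pi, xi in zip(p, x))
assert geo <= ari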
In 2010, Dragomir obtained a refinement of Jensen's inequality [2] (stated as Theorem 1.2 below). In the same year, Dragomir also obtained a different refinement of Jensen's inequality [3] (stated as Theorem 1.3 below). In [4], Popescu et al. introduced a general functional related to Jensen's inequality, and it is this functional that is used throughout this paper. In [5], Horváth developed a general method to refine the discrete Jensen's inequality in the convex and mid-convex cases; the main parts of the inequalities in Theorems 1.2 and 1.3 are special cases of Theorem 1 in that paper. Recently, Horváth et al. [6] presented new upper bounds for the Shannon entropy (see Corollary 1) and defined an extended f-divergence functional (see Definition 2) by applying a cyclic refinement of Jensen's inequality. For more refinements and applications related to Jensen's inequality, see [7][8][9][10][11][12][13][14][15][16][17].
The main aim of this paper is to extend the results of Dragomir [3] and Popescu et al. [4] by means of the aforementioned functional. In Section 2, we give a refinement of Jensen's inequality associated with the general functional. The refinement yields estimates of the Jensen gap and tightens the inequalities (4). In Section 3, we show the applications in information theory. We propose and prove new upper bounds for Shannon's entropy that are tighter than the bound given in [4]. Finally, we obtain new bounds for some f-divergences that are better than the bounds given in [3].

General inequalities by generalization
We continue to use the aforementioned definitions and now present the main results.
The aforementioned inequality can be rewritten as follows.
Proof. We assume the value of ( … ). We then choose two nonempty subsets … . Then the main results above are given as follows.
Theorem 2.3. Let C be a convex subset of the real linear space X and assume that $f\colon C\to\mathbb{R}$ is a convex function on C. Assume further that … .
Proof. Since the first inequality and the last inequality follow from Theorem 1.3, we may suppose that $n \ge 4$, and we need only prove the middle inequalities. Jensen's inequality can be applied, and from the aforementioned equality we obtain that … holds. Since each partition at step $m+1$ is a refinement of a partition at step $m$, the result follows. The proof is complete. □
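To illustrate the partition idea underlying the refinements in this section, the following Python sketch (my own numerical illustration with a hypothetical subset S, not the authors' exact functional) splits the index set into two nonempty blocks; this produces an intermediate term wedged between the two sides of Jensen's inequality, since the global mean is a convex combination of the block means.

import math

def jensen_gap_refinement(p, x, S, f):
    # Split indices into S and its complement Sc, both assumed nonempty.
    Sc = [i for i in range(len(p)) if i not in S]
    PS, PSc = sum(p[i] for i in S), sum(p[i] for i in Sc)
    mS = sum(p[i] * x[i] for i in S) / PS      # p-weighted mean of x over S
    mSc = sum(p[i] * x[i] for i in Sc) / PSc   # p-weighted mean of x over Sc
    lower = f(sum(pi * xi for pi, xi in zip(p, x)))   # f of the global mean
    middle = PS * f(mS) + PSc * f(mSc)                # refined intermediate term
    upper = sum(pi * f(xi) for pi, xi in zip(p, x))   # right-hand side of Jensen
    return lower, middle, upper

p = [0.1, 0.2, 0.3, 0.4]
x = [1.0, 2.0, 5.0, 10.0]
lo, md, up = jensen_gap_refinement(p, x, S=[0, 1], f=lambda t: t * math.log(t))
assert lo <= md <= up   # the intermediate term tightens the classical inequality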

New upper bounds for Shannon's entropy
In line with previous work, bounds for Shannon's entropy [18] can be found in [4,8,10,15]. For further discussion, we first recall the definition of Shannon's entropy. If the discrete probability distribution $\mathbf{p}=(p_1,\ldots,p_n)$ is given, then Shannon's entropy is defined by
$$ H(\mathbf{p}) := \sum_{i=1}^{n} p_i \ln\frac{1}{p_i}. $$
Furthermore, considering the aforementioned results, the following tighter bounds for Shannon's entropy are presented.
Proof. Taking into consideration the inequalities of Theorem 2.1 applied to the corresponding convex function, we obtain the stated bounds. □
Proof. Taking into consideration the inequalities of Theorem 2.3, we obtain the inequalities (11) by the same method as above. □
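For orientation, the following Python sketch (an illustration only, not the refined bounds of this section) computes Shannon's entropy for an example distribution and checks the classical consequence of Jensen's inequality, $H(\mathbf{p}) \le \ln n$; the bounds presented above tighten this kind of estimate.

import math

def shannon_entropy(p):
    # H(p) = sum_i p_i ln(1/p_i), with the convention 0 ln(1/0) = 0
    return sum(pi * math.log(1.0 / pi) for pi in p if pi > 0)

p = [0.1, 0.2, 0.3, 0.4]
H = shannon_entropy(p)
assert H <= math.log(len(p))     # classical upper bound ln n
print(H, math.log(len(p)))       # ~1.2799 vs ~1.3863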

New lower bounds for f-divergence measures
Given a convex function $f\colon[0,\infty)\to\mathbb{R}$, the $f$-divergence functional
$$ I_f(\mathbf{p},\mathbf{q}) := \sum_{i=1}^{n} q_i f\!\left(\frac{p_i}{q_i}\right), $$
where $\mathbf{p}=(p_1,\ldots,p_n)$ and $\mathbf{q}=(q_1,\ldots,q_n)$ are positive sequences, was introduced by Csiszár in [19] as a generalized measure of information, a "distance function" on the set of probability distributions $\mathbb{P}_n$. As in [19], we interpret undefined expressions by
$$ f(0):=\lim_{t\to 0^+} f(t),\qquad 0\,f\!\left(\frac{0}{0}\right):=0,\qquad 0\,f\!\left(\frac{a}{0}\right):=a\lim_{t\to\infty}\frac{f(t)}{t},\quad a>0. $$
The following results were essentially given by Csiszár and Körner. In particular, if $\mathbf{p},\mathbf{q}\in\mathbb{P}_n$, then (14) holds. This is the well-known nonnegativity property of the $f$-divergence. Dragomir gives the corresponding concept for functions defined on a cone in a linear space as follows [3]. In the first place, we recall that a subset $K$ of a linear space $X$ is a cone if the following two conditions are satisfied: (i) for any $x,y\in K$ we have $x+y\in K$; (ii) for any $x\in K$ and any $\alpha\ge 0$ we have $\alpha x\in K$. For a given $n$-tuple of vectors $\mathbf{z}=(z_1,\ldots,z_n)\in K^n$ and a probability distribution $\mathbf{q}\in\mathbb{P}_n$ with all values nonzero, we can define, for the convex function $f\colon K\to\mathbb{R}$, the generalized $f$-divergence functional
$$ I_f(\mathbf{z},\mathbf{q}) := \sum_{i=1}^{n} q_i f\!\left(\frac{z_i}{q_i}\right). $$
In the scalar case, if $\mathbf{z}=\mathbf{p}\in\mathbb{P}_n$, a sufficient condition for the positivity of the $f$-divergence $I_f(\mathbf{p},\mathbf{q})$ is $f(1)\ge 0$. The case of functions of a real variable, which is the one meaningful for applications, is contained in the following. In what follows, we provide some lower bounds for a number of $f$-divergences that are used in various fields of information theory, probability theory, and statistics.
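A minimal Python sketch of the $f$-divergence functional defined above, together with the stated conventions for zero entries, is given below (my own illustration, not code from the paper; the parameter name f_slope_at_inf, standing for $\lim_{t\to\infty} f(t)/t$, is hypothetical).

import math

def f_divergence(f, p, q, f_slope_at_inf=math.inf):
    total = 0.0
    for pi, qi in zip(p, q):
        if qi > 0:
            total += qi * f(pi / qi)          # ordinary term q_i f(p_i / q_i)
        elif pi > 0:
            total += pi * f_slope_at_inf      # convention: 0 f(a/0) = a lim f(t)/t
        # convention: 0 f(0/0) = 0, so (p_i, q_i) = (0, 0) contributes nothing
    return total

# Example: Kullback-Leibler divergence, f(t) = t ln t, with positive distributions.
p = [0.5, 0.5]
q = [0.9, 0.1]
print(f_divergence(lambda t: t * math.log(t), p, q))   # ~0.5108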
The total variation distance is defined by the convex function $f(t)=|t-1|$, $t\in\mathbb{R}$, and given by
$$ V(\mathbf{p},\mathbf{q}) := \sum_{i=1}^{n} q_i\left|\frac{p_i}{q_i}-1\right| = \sum_{i=1}^{n} |p_i-q_i|, $$
which proves the last part of inequalities (23). □ The Kullback-Leibler divergence [22] can be obtained for the convex function $f(t)=t\ln t$, $t>0$, and is given by
$$ KL(\mathbf{p},\mathbf{q}) := \sum_{i=1}^{n} q_i\,\frac{p_i}{q_i}\ln\frac{p_i}{q_i} = \sum_{i=1}^{n} p_i\ln\frac{p_i}{q_i}. $$
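The following short Python check (assuming strictly positive example distributions p and q of my own choosing) confirms that the two convex generators above reproduce the total variation distance and the Kullback-Leibler divergence through $I_f(\mathbf{p},\mathbf{q})=\sum_i q_i f(p_i/q_i)$, and that both are nonnegative, consistent with the nonnegativity property recalled above.

import math

p = [0.1, 0.2, 0.3, 0.4]
q = [0.25, 0.25, 0.25, 0.25]

def I_f(f, p, q):
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

tv = I_f(lambda t: abs(t - 1.0), p, q)        # generator f(t) = |t - 1|
kl = I_f(lambda t: t * math.log(t), p, q)     # generator f(t) = t ln t

assert abs(tv - sum(abs(pi - qi) for pi, qi in zip(p, q))) < 1e-12
assert abs(kl - sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))) < 1e-12
assert tv >= 0 and kl >= 0                    # nonnegativity of both divergences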

Conclusion
The classical Jensen's inequality plays a very important role in both theory and applications. In this paper, we have obtained some refinements of Jensen's inequality (5)-(8) in a real linear space using the generalized Popescu et al. functional. Moreover, we have obtained new and sharper bounds for Shannon's entropy and several f-divergence measures in information theory. In future work, we will continue to explore other applications of the inequalities newly obtained in Section 2.