On the influence function for the Theil-like class of inequality measures

On one hand, a large class of inequality measures, which includes the generalized entropy, the Atkinson and the Gini measures, among others, was introduced in Mergane and Lo (2013). On the other hand, the influence function of a statistic is an important tool in the asymptotics of nonparametric statistics, and it has been determined and analysed in various respects for a large number of statistics. We proceed to a unifying study of the influence functions (IF) of all the members of the so-called Theil-like family and regroup those IFs in one formula. Comparative studies thereby become easier.


Introduction
Over the years, a number of measures of inequality have been developed. Examples include the generalized entropy, the Atkinson, the Gini, the quintile share ratio and the Zenga measures (see e.g. Cowell and Flachaire (2007); Cowell et al. (2009); Hulliger and Schoch (2009); Zenga (1984) and Zenga (1990)). Recently, Mergane and Lo (2013) gathered a significant number of inequality measures under the name of the Theil-like family. Such inequality measures are very important in capturing inequality in income distributions. They also have applications in many other branches of science, e.g. in ecology (see e.g. Magurran (1991)), sociology (see e.g. Allison (1978)), demography (see e.g. White (1986)) and information science (see e.g. Rousseau (1993)).
In order to make the above mentioned measures applicable, one often makes use of estimation. Classical methods unfortunately rely heavily on assumptions which are not always met in practice. For example, when there are outliers in the data, classical methods often perform very poorly. The idea in robust statistics is to develop estimators that are not unduly affected by small departures from model assumptions; in order to measure the sensitivity of estimators to outliers, the influence function (IF) was introduced (see Hampel (1974) and Hampel et al. (1986)).
Let us begin by making precise the objects and notation of our study, in particular the influence function. To ease the reading of what follows, we suppose that we have a probability space (Ω, A, P) holding a random variable X associated with the cumulative distribution function (cdf) F(x) = P(X ≤ x), x ∈ R, and a sequence of independent copies of X: X_1, X_2, etc. This random variable is considered as an income variable, so that it is non-negative and F(0) = 0. The absolutely continuous density function of X with respect to the Lebesgue measure on R (pdf), if it exists, is denoted by f. Its mean, which we suppose finite and non-zero, and its moments of order α ≥ 1 are denoted by µ = ∫ x dF(x) and µ_α = ∫ x^α dF(x). The quantile function associated to F, also called the generalized inverse function, is defined by F^{-1}(s) = inf{x ∈ R : F(x) ≥ s}, s ∈ (0, 1), and the Lorenz curve of F is given by L(p) = (1/µ) ∫_0^p F^{-1}(s) ds, p ∈ (0, 1). A statistical functional T(F) will be studied as well as its plug-in nonparametric estimator of the form T(F_n), where F_n is the empirical cdf based on the sample X_1, ..., X_n, n ≥ 1.
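The Lorenz curve defined above can be evaluated on the empirical cdf of a sample. The following sketch is our own illustration (the function name `lorenz` is not from the paper): it linearly interpolates the cumulative income shares at the grid points k/n.

```python
import numpy as np

def lorenz(sample, p):
    """Empirical Lorenz curve L(p) = (1/mu) * integral_0^p F^{-1}(s) ds,
    evaluated by linear interpolation of cumulative income shares."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    # Cumulative share of total income held by the poorest k individuals.
    cum = np.concatenate(([0.0], np.cumsum(x))) / x.sum()
    return np.interp(p, np.arange(n + 1) / n, cum)

x = [1, 2, 3, 4]
print(lorenz(x, 0.5))  # bottom half holds 3/10 of the total: 0.3
```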
The influence function IF(z, T, F) of T(F) is the Gateaux derivative of T at F in the direction of Dirac measures, in the form IF(z, T, F) = lim_{ε↓0} (T((1 − ε)F + ε∆_z) − T(F))/ε, where ∆_z is the cdf of δ_z, the Dirac measure with mass one at z, and z is in the value domain of F.
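The Gateaux-derivative definition can be checked numerically by placing a small mass ε at z on top of the empirical distribution. A minimal sketch (our own; `empirical_if` and `weighted_mean` are illustrative names, not the paper's), using T(F) = mean, whose influence function is the well-known IF(z) = z − µ:

```python
import numpy as np

def empirical_if(T, sample, z, eps=1e-6):
    """Finite-difference approximation of the Gateaux derivative:
    (T((1-eps) F_n + eps Delta_z) - T(F_n)) / eps."""
    n = len(sample)
    base = T(np.asarray(sample, dtype=float), np.full(n, 1.0 / n))
    w = np.full(n + 1, (1.0 - eps) / n)
    w[-1] = eps                       # contamination mass at z
    contaminated = T(np.append(sample, z), w)
    return (contaminated - base) / eps

def weighted_mean(x, w):
    return float(np.sum(w * x))

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1000)
z = 5.0
approx = empirical_if(weighted_mean, x, z)
exact = z - x.mean()                  # IF of the mean: z - mu
print(abs(approx - exact))            # small, up to floating-point error
```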
It is known that the asymptotic variance of the plug-in estimator T(F_n) of the statistic T(F) is of the form σ² = ∫ IF(x, T(F))² dF(x) under specific conditions, among them Hadamard differentiability (see Wasserman (2000), Theorem 2.27, page 19). So the influence function gives an idea of what the variance of the Gaussian limit of the estimator might be, if it exists. At the same time, the behavior of its tails (lower and upper) gives indications on how lower extreme and/or upper extreme values impact the quality of the estimation. For example, the sensitivity of a statistic T(F) and the impact of extreme observations on some influence functions have recently been studied by, e.g., Cowell and Flachaire (2007).
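The variance formula can be illustrated by a small Monte Carlo experiment (our own, not from the paper): with T(F) the mean, IF(x) = x − µ, so σ² = ∫ IF² dF = Var X, and the variance of √n (T(F_n) − T(F)) should approach it.

```python
import numpy as np

# For T(F) = mean, IF(x) = x - mu, hence sigma^2 = Var X.
rng = np.random.default_rng(42)
n, reps = 500, 2000
scale = 2.0                       # Exponential(scale=2): mu = 2, Var X = 4
mu, sigma2 = scale, scale**2
stats = np.array([np.sqrt(n) * (rng.exponential(scale, n).mean() - mu)
                  for _ in range(reps)])
print(stats.var())                # close to sigma2 = 4
```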
Another interesting fact is that the influence function behaves in nonparametric estimation as the score function does in the parametric setting (see Wasserman (2000), page 19).
An area of application of the influence function is that of measures of inequality (see, e.g., Van Praag et al. (1983), Victoria-Feser (2000) and Kpanzou (2015)). Due to the importance of that key element of nonparametric estimation in econometrics and welfare studies, a collection of influence functions of inequality measures is being actively built up. To cite a few, the IFs of the following measures are given in the Appendix: the generalized entropy class of inequality measures GE(α), where α > 0, the mean logarithmic deviation (MLD), the Theil measure, the Atkinson class of inequality measures of parameter α ∈ (0, 1], the Gini coefficient, and the quintile share ratio measure of inequality (QSR).
The inequality measures mentioned above are derived from (1.2) with the particular values of α, τ, h, h_1 and h_2 as described below for all s > 0. (a) Generalized Entropy. This is simply the plug-in estimator of the corresponding GE(α) index. The following conditions are required for the asymptotic theory.

(B1) The function τ admits a derivative τ′ which is continuous at I.
This offers an opportunity to present a significant number of IFs in a unified approach, which may be an asset for the comparison of inequality measures; indeed, this constitutes the main goal of this paper. Let us add more notation. The lower endpoint and upper endpoint of the cdf F are denoted by lep(F) = inf{x : F(x) > 0} and uep(F) = sup{x : F(x) < 1}. So the domain of admissible values for X, denoted by V_X, satisfies V_X ⊂ R_X = [lep(F), uep(F)], the latter being the range of F. The layout of this paper is as follows. In the next section we state our main result on the influence function of the TLIM family members, together with particularized forms for each known member. For members whose IFs are already available in the literature, we make a comparison. In Section 3, we give the complete proofs. In Section 4 we provide a conclusion and some perspectives. Section 5 is an appendix gathering the IF expressions of some members of the TLIM family available in the literature.

Main results
(A) - The main theorem. Theorem 1. If conditions (B1) − (B2) hold, then the influence function of the TLIM index is given by Remark on the asymptotic variance. It was said earlier that the plug-in estimator should give the asymptotic variance of the limiting Gaussian variable, if it exists, as σ² = ∫ IF(x, T(F))² dF(x). This is exactly the case, by the asymptotic normality of the plug-in estimator as established in Theorem 2 of Mergane et al. (2018).
Let us move to the illustrations of our results for particular cases.

(B) -Particular forms.
Let us proceed to the study of particular members of the TLIM class. We compare our results with existing ones, when available, in the Appendix. When the computations are simple, we only give the result without further details.
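As a concrete illustration of one particular member, a plug-in estimator of the generalized entropy index can be sketched as follows. We assume the standard form GE(α) = ((1/n) Σ (x_i / x̄)^α − 1) / (α(α − 1)), α ∉ {0, 1}; this formula is a common convention, not quoted from the paper.

```python
import numpy as np

def ge_index(x, alpha):
    """Plug-in generalized entropy index for alpha not in {0, 1}
    (standard textbook form; assumed, not taken from the paper)."""
    x = np.asarray(x, dtype=float)
    return (np.mean((x / x.mean())**alpha) - 1.0) / (alpha * (alpha - 1.0))

x = np.array([1.0, 2.0, 3.0, 10.0])
# Sanity check: GE(2) equals half the squared coefficient of variation.
cv2 = x.var() / x.mean()**2
print(abs(ge_index(x, 2.0) - cv2 / 2))  # ≈ 0
```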

Proof of the main theorem
In the following proof, we use the method of finding the IF as given in Kahn (2015). Suppose that we are interested in estimating T(P_X), where P_X is the image measure of P defined by P_X(B) = P(X ∈ B) for B ∈ B(R); it is also the Lebesgue-Stieltjes probability law associated with F, that is P_X((−∞, x]) = F(x), x ∈ R. Here we use integrals based on measures, and thus integrals in dF are integrals in dP_X in the following sense: for any non-negative and measurable function ℓ : R → R, we have ∫ ℓ(X) dP = ∫ ℓ(y) dP_X(y) ≡ ∫ ℓ(y) dF(y).
Suppose that T(P) is defined on a family of probability measures P_λ, P_λ being associated with the random variable X_λ, with X = X_{λ_0} and F = F_{λ_0}, and suppose that T is independent of λ. If the derivative of λ ↦ T(P_λ) at λ_0 can be written in the form ∫ ℓ(y) d(dP_λ/dλ)|_{λ=λ_0}(y), where ℓ is measurable and P_X-integrable, then, taking for P_λ the contamination path (1 − λ)F + λ∆_z, the IF at T(F_{λ_0}) = T(F) is given by the centered form IF(z, T(F)) = ℓ(z) − ∫ ℓ(y) dF(y). Actually, the rule uses properties of Gâteaux differentiation and constitutes one of the fastest methods of finding the IF. We are going to apply it.
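The rule can be illustrated symbolically on the simplest functional T(F) = ∫ h dF: along the contamination path P_ε = (1 − ε)F + ε∆_z, T(P_ε) = (1 − ε)H + ε h(z) with H = ∫ h dF, and differentiation at ε = 0 yields the centered form h(z) − H. A small sketch with sympy (symbol names are ours):

```python
import sympy as sp

# T(F) = ∫ h dF.  Along P_eps = (1 - eps) F + eps Δ_z,
# T(P_eps) = (1 - eps) H + eps h(z), with H = ∫ h dF.
eps, hz, H = sp.symbols('epsilon h_z H')
T_path = (1 - eps) * H + eps * hz
IF = sp.diff(T_path, eps).subs(eps, 0)
# IF should equal h(z) - ∫ h dF, i.e. h_z - H:
print(sp.simplify(IF - (hz - H)))  # 0
```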
Proof of Theorem 1.
We remind the notation.
We have
We get
By centering at expectations, we have

Conclusion and Perspectives
In this paper, we studied the Theil-like family of inequality measures introduced in Mergane et al. (2018). Following the paper on the asymptotic normality of that family, we focused on its influence function. The results are compared with those of some authors in particular cases. We think that this unified and compact approach will serve as a general tool for comparison purposes. In addition, in computation packages, it allows more compact programs, resulting in more efficiency. A paper on computational aspects will follow soon.

Appendix: A list of some Influence Functions
Here, we list a number of inequality measures and the corresponding influence functions.
The Generalized Entropy Measures of Inequality GE(α), which depend on a parameter α > 0 and are defined by have the IF (see e.g. Cowell and Flachaire (2007)) (5.1) Important remark. Our result on the IF of the GE(α) differs from that of Cowell and Flachaire (2007) by the multiplicative coefficient 1/(α(α − 1)µ_F^α). In other words, that coefficient is missing in Cowell and Flachaire (2007). We also obtain the same result by the direct computations below.
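The coefficient 1/(α(α − 1)µ^α) in the remark can be re-derived symbolically. Assuming the standard form GE(α) = (µ_α/µ^α − 1)/(α(α − 1)) with µ_α = E X^α (our assumption, consistent with the class described above), differentiating along the contamination path (1 − ε)F + ε∆_z gives:

```python
import sympy as sp

z, eps = sp.symbols('z epsilon', positive=True)
mu, mua, a = sp.symbols('mu mu_alpha alpha', positive=True)

# Contaminated mean and alpha-moment along P_eps = (1 - eps) F + eps Δ_z:
mu_e = (1 - eps) * mu + eps * z
mua_e = (1 - eps) * mua + eps * z**a
GE_e = (mua_e / mu_e**a - 1) / (a * (a - 1))

IF = sp.diff(GE_e, eps).subs(eps, 0)
# Candidate closed form, carrying the coefficient 1/(a (a-1) mu^a):
candidate = (z**a - mua - a * mua * (z - mu) / mu) / (a * (a - 1) * mu**a)
print(sp.simplify(IF - candidate))  # 0
```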
By the method described in the proof, we may center the integrand to get the same expression, which again gives the result.
The Mean Logarithmic Deviation (MLD), which is the special case of the GE class where α = 0, is defined by The Theil Measure, which is also a special case of the GE class, for α = 1, The Atkinson Class of Inequality Measures of parameter α ∈ (0, 1], defined by (see Cowell and Flachaire (2007)) and its influence function is given by where ν = E X^{1−α}.
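These special cases are linked by a classical identity that gives a quick numerical cross-check (our own illustration, not from the paper): for α = 1 the Atkinson index reduces to A_1 = 1 − exp(−MLD), where MLD = E log(µ/X).

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0])
mu = x.mean()

mld = np.mean(np.log(mu / x))                        # mean logarithmic deviation
atkinson1 = 1.0 - np.exp(np.mean(np.log(x))) / mu    # Atkinson index, alpha = 1

# The two quantities satisfy A_1 = 1 - exp(-MLD) exactly:
print(abs(atkinson1 - (1.0 - np.exp(-mld))))  # ≈ 0
```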
We notice that for α = 1, we have where 1_A is the indicator function of a set A; this case is associated with the IF described below (see Kpanzou (2015)). Let
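To accompany the appendix, the Gini coefficient, whose IF is given in Kpanzou (2015), can be estimated by the standard sorted-sample formula; a small illustration (our code, standard formula assumed):

```python
import numpy as np

def gini(sample):
    """Plug-in Gini coefficient via the sorted-sample formula
    G_n = 2 * sum_i i * x_(i) / (n * sum_i x_i) - (n + 1) / n."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    return 2.0 * np.sum(i * x) / (n * x.sum()) - (n + 1) / n

print(gini([5.0, 5.0, 5.0, 5.0]))  # 0.0 for perfect equality
```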