Convergence of income distributions: Total and inequality-affecting changes in the EU

By adapting the statistical framework suggested by Székely and Rizzo (2004) and considering the convergence of income distributions instead of aggregate (e


Introduction
In this paper, we consider the convergence of income distributions by directly comparing their yearly changes derived from individual-level sample data, rather than inferring convergence from some stochastic process of aggregate summary measures like average or median income. 1 The yearly measurement allows us to observe the dynamics of the convergence, whereas considering the whole distribution opens up the possibility of evaluating not only the convergence of income but also the convergence of income inequality within the same framework.
1 [...] (2007), Young et al. (2008), Durlauf et al. (2009), and Johnson and Papageorgiou (2020); recent examples of application to the EU are Alcidi et al. (2018) and Cabral and Castellanos-Sosa (2019); Ravallion (2003), Bleaney and Nishiyama (2003), and Chambers and Dhongde (2016) are examples of the application of such methodology to the analysis of convergence of income inequality instead of aggregate income.
One of the critical properties of proper metrics of income inequality is their scale-independence or, in other words, their invariance to a common rescaling of everyone's income. Therefore, changes in properly rescaled income might be informative about convergence in terms of inequality. One might be tempted to use the difference in (or ratio of) averages or medians for this purpose (as in, e.g., Handcock et al., 1996, and Handcock and Morris, 1998, 1999), which is, however, an arbitrary choice. We propose to use a more general rescaling based on the minimization of the distance between the (quantile functions of the) analysed income distributions and the pooled income distribution.
To evaluate the multi-country convergence (divergence) of income distributions, we adapt Székely and Rizzo's (2004) approach of statistical testing for equal distributions. To be able to use the statistical test of convergence of distributions for inference about income convergence, we not only apply it yearwise but also change the perspective of how the test statistic is exploited. From the statistical point of view, the test statistic converges to its limiting distribution under the null hypothesis of equal distributions as the number of observations (n) increases to infinity, and diverges under the alternative. On the other hand, if we keep n fixed at some positive integer, 2 then all changes observed in the expectation of the test statistic would stem from the variation in the underlying data generating process (DGP): this is what underlies the (economic) evaluation of convergence and divergence of income over years. Using the same number of observations, 3 which we achieve through independent and random sampling of realizations from (consistent estimates of) income distributions, also avoids the incomparability problem related to a varying population/sample size over years and among countries.
We illustrate the functionality of the proposed methodology with Monte Carlo (MC) simulations and use it to characterize the EU-wide income convergence during 2007-2014. Here, the underlying distributions of real annual equivalized net income are derived separately for each country and year from the harmonized European Union Statistics on Income and Living Conditions (EU-SILC) survey database. 4

Multi-country convergence: Evaluation and testing
Consider Székely and Rizzo's (2004) approach to testing for equal distributions (possibly in a high dimension) using the generic notation that, for a while, ignores the potential time-[...] where for any pair [...] (2)
Eq. (2) satisfies the triangle inequality, and therefore, $\xi_n = 0$ if all elements in the samples coincide, and $\xi_n > 0$ otherwise. Székely and Rizzo (2004) show that, under the null hypothesis, $\xi_n$ has a well-defined limiting distribution, whereas under the composite alternative, $\xi_n$ diverges as $n \to \infty$. Székely and Rizzo (2004) suggest bootstrapping from the pooled sample to derive the critical values.
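For readers who wish to experiment, the following minimal sketch computes a k-sample energy statistic in the form commonly associated with Székely and Rizzo (2004), together with a critical value obtained by resampling from the pooled sample. It is an illustration under stated assumptions rather than a reproduction of Eqs. (1)-(2): the samples are univariate, the distance is the absolute difference, resampling is done with replacement, and the function names (`pair_energy`, `k_sample_energy`, `bootstrap_critical_value`) are hypothetical.

```python
import numpy as np

def pair_energy(x, y):
    """Two-sample energy statistic for univariate samples x and y (assumed form)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n1, n2 = len(x), len(y)
    # Mean absolute differences between and within the samples.
    between = np.abs(x[:, None] - y[None, :]).mean()
    within_x = np.abs(x[:, None] - x[None, :]).mean()
    within_y = np.abs(y[:, None] - y[None, :]).mean()
    return n1 * n2 / (n1 + n2) * (2.0 * between - within_x - within_y)

def k_sample_energy(samples):
    """Sum of the pairwise statistics over all distinct pairs of samples."""
    k = len(samples)
    return sum(pair_energy(samples[i], samples[j])
               for i in range(k) for j in range(i + 1, k))

def bootstrap_critical_value(samples, alpha=0.05, n_boot=200, rng=None):
    """(1 - alpha) quantile of the statistic when all samples are redrawn
    from the pooled sample (resampling with replacement is an assumption)."""
    rng = np.random.default_rng(rng)
    pooled = np.concatenate([np.asarray(s, float) for s in samples])
    sizes = [len(s) for s in samples]
    stats = [k_sample_energy([rng.choice(pooled, size=m, replace=True) for m in sizes])
             for _ in range(n_boot)]
    return float(np.quantile(stats, 1.0 - alpha))
```

With these pieces, equality of distributions would be rejected at level `alpha` whenever `k_sample_energy(samples)` exceeds `bootstrap_critical_value(samples, alpha)`.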
2 It cannot be very small, so that the test retains empirical power against the null hypothesis.
3 It can also be increasing, just uniformly across all years to retain the discussed interpretation.
4 Further details on the employed data are provided in Cseres-Gergely and Kvedaras (2019).
5 The usual cumulative distribution function (CDF) of a univariate real-valued continuous random variable $Y_i$ will be used, as denoted by $F_i(y) = P(Y_i \le y)$, where $P$ is the probability measure. We will assume that the underlying CDFs are absolutely continuous, monotonically strictly increasing, and twice differentiable.
Let $\kappa^{(b)}_{1-\alpha}$ stand for the $\alpha$-size critical value from the bootstrap based on $b$ replications. Then either $\xi_n > \kappa^{(b)}_{1-\alpha}$ or, equivalently, $\tau_{1-\alpha} = \xi_n / \kappa^{(b)}_{1-\alpha} > 1$ would reject $H_0$. We use the latter, as it avoids the scale dependence on $\kappa^{(b)}_{1-\alpha}$. Separately for each year, the empirical analogue of $E(\tau_{1-\alpha})$, denoted by $\hat{\tau}_{1-\alpha}$, is obtained from the following simulations indexed by $s \in \{1, \dots, S\}$. First, after the consistent non-parametric estimation of the distribution functions together with their inverses (the quantile functions), we draw random samples of size $n_0 \in \mathbb{N}$ for each country 7 using the procedure suggested by Hutson and Ernst (2000): first generating independent random samples from the uniform distribution on [0, 1] and then mapping them through a quantile function to the respective independent and identically distributed samples of the variable of interest (income). Then, after implementing the earlier described bootstrap procedure that delivers the critical value $\kappa^{(b)}_{1-\alpha}$, we use the repeated samples to estimate $E[\tau_{1-\alpha}]$ by averaging the simulation realizations $\tau^{(s)}_{1-\alpha}$ (thus yielding $\hat{\tau}_{1-\alpha}$). Under $H_0$, the probability 8
$$P(\tau_{1-\alpha} > 1) \to \alpha \quad \text{as } n \to \infty. \tag{3}$$
Therefore, in cases where the equality of income distributions under evaluation cannot be rejected, the distribution of $\xi^{(s)}_n / \kappa^{(b)}_{1-\alpha}$, connected with the generated samples as described above, should approximately satisfy condition (3) for not too small values of $n$ (as well as a sufficiently large number of bootstrap replications and MC sampling iterations), and the $1-\alpha$ quantile of $\xi^{(s)}_n / \kappa^{(b)}_{1-\alpha}$ should be close to 1. With $n$ fixed, changes in $E(\tau_{1-\alpha})$ over the years are driven by the underlying DGP. Hence, a decreasing (increasing) $E(\tau_{1-\alpha})$ would point to the presence of convergence (divergence) in economic terms. Since the discussed statistic is not directly observable, we will use the respective estimate $\hat{\tau}_{1-\alpha}$ with its confidence bands (i.e., of the average). We shall further use the notion of full convergence to characterize the state of insignificant difference between the distributions, and not the potential shift in their difference. Such full convergence will be checked by evaluating the coincidence of the $1-\alpha$ quantile of $\xi^{(s)}_n / \kappa^{(b)}_{1-\alpha}$ with 1.
The income convergence evaluation above looks at the total convergence using the original $\{F_i\}_{i=1}^{k}$ without any potential adjustment for the scale differences of income. To obtain the scale-independent evaluation of convergence using the above procedure, one just needs to replace $\{F_i\}_{i=1}^{k}$ with $\{\tilde{F}_i\}_{i=1}^{k}$, where $\tilde{F}_i$ denotes a CDF of the rescaled income $\tilde{y}_{i,r} = \hat{b}_i y_{i,r}$, where $y_{i,r}$, $r \in \{1, \dots, N_i\}$, denotes the original income observations underlying $F_i$ in country $i$ with its sample size $N_i$, and
$$\hat{b}_i = \arg\min_{b > 0} \left\| F^{-1} - b\, F_i^{-1} \right\|, \tag{4}$$
where $F$ stands for some reference CDF, and in particular, we will use the pooled income CDF derived from observations of all countries (in a fixed year). 9
Notice that a usual requirement for an income inequality metric is its scale-independence, so multiplication of all persons' income by a positive constant would not change the inequality level. Hence, if no (significant) difference remained between the distributions after the described rescaling based on the scaling constants from Eq. (4), inequality of income, as measured by any scale-independent metric, would be the same. Furthermore, the above minimization of the differences between quantile functions 10 implies that the difference remaining after such a transformation cannot be removed without affecting the value of the scale-independent inequality metric of income. Therefore, using $\{\tilde{F}_i\}_{i=1}^{k}$ in the earlier convergence evaluation procedure is informative about the inequality-affecting income convergence that is independent of the scale differences of $\{F_i\}_{i=1}^{k}$.
8 [...] of the test statistic under the null hypothesis, and this could be tested further. Due to computational intensity, in the empirical application, we restrict our attention to only one size level, $\alpha = 0.05$, by evaluating if the 95% quantile of $\tau^{(s)}_{0.95}$ coincides with 1.
9 In the empirical application, we will use the $L_1$ norm.
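The rescaling step can be sketched as follows, assuming that the scaling constant minimizes the $L_1$ distance between the reference (pooled) quantile function and the rescaled country quantile function, evaluated on an equally spaced probability grid. For strictly positive incomes this minimization reduces to a weighted median of quantile ratios. The function names, the grid size, the unweighted pooling of observations, and the positivity requirement are assumptions made for this illustration only.

```python
import numpy as np

def l1_scale(y, y_ref, n_grid=999):
    """Scaling constant b > 0 minimizing sum_j |Q_ref(u_j) - b * Q_y(u_j)|.

    Assumes strictly positive incomes, so the problem is a weighted median
    of the ratios Q_ref / Q_y with weights Q_y.
    """
    u = np.arange(1, n_grid + 1) / (n_grid + 1)   # interior probability grid
    q = np.quantile(np.asarray(y, float), u)      # country quantile function
    r = np.quantile(np.asarray(y_ref, float), u)  # reference quantile function
    if np.any(q <= 0):
        raise ValueError("this sketch assumes strictly positive incomes")
    ratios = r / q
    order = np.argsort(ratios)
    cum_weights = np.cumsum(q[order])
    # First sorted ratio at which the cumulative weight reaches half the total.
    idx = np.searchsorted(cum_weights, 0.5 * cum_weights[-1])
    return float(ratios[order][idx])

def rescaling_constants(countries):
    """One scaling constant per country, with the pooled sample as reference."""
    pooled = np.concatenate([np.asarray(c, float) for c in countries])
    return [l1_scale(c, pooled) for c in countries]
```

The rescaled observations for country i are then obtained by multiplying its incomes by the i-th element of `rescaling_constants(countries)`; if a country's income is an exact multiple of the reference income, the sketch recovers that multiple.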
Finally, we note that the notation above is generic, whereas the analysis is performed separately for each year, with all quantities being time-variant.
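The simulation procedure of this section, for a single year, can be sketched as below. This is a simplified illustration and not the implementation used in the paper: the empirical quantile function with linear interpolation (`np.quantile`) stands in for the non-parametric quantile estimator, the energy statistic is written in its commonly used univariate form, the confidence band is a normal approximation for the mean, and all function names and default values are hypothetical.

```python
import numpy as np

def energy_stat(samples):
    """k-sample energy statistic for univariate samples (assumed form)."""
    total = 0.0
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            x, y = samples[i], samples[j]
            d_xy = np.abs(x[:, None] - y[None, :]).mean()
            d_xx = np.abs(x[:, None] - x[None, :]).mean()
            d_yy = np.abs(y[:, None] - y[None, :]).mean()
            total += len(x) * len(y) / (len(x) + len(y)) * (2 * d_xy - d_xx - d_yy)
    return total

def draw_from_quantile_function(data, n0, rng):
    """Uniform draws on [0, 1] mapped through the empirical quantile function."""
    u = rng.uniform(0.0, 1.0, size=n0)
    return np.quantile(data, u)

def tau_simulation(country_data, n0=50, n_boot=50, n_sim=40, alpha=0.05, seed=0):
    """Return (tau_hat, confidence band of the mean, (1 - alpha) quantile of tau^(s))."""
    rng = np.random.default_rng(seed)
    taus = np.empty(n_sim)
    for s in range(n_sim):
        # Step 1: equal-sized samples of size n0 for every country.
        samples = [draw_from_quantile_function(d, n0, rng) for d in country_data]
        xi = energy_stat(samples)
        # Step 2: bootstrap critical value from the pooled sample.
        pooled = np.concatenate(samples)
        boot = [energy_stat([rng.choice(pooled, size=n0) for _ in samples])
                for _ in range(n_boot)]
        kappa = np.quantile(boot, 1.0 - alpha)
        # Step 3: realization of the ratio tau^(s) = xi / kappa.
        taus[s] = xi / kappa
    tau_hat = float(taus.mean())
    half_width = 1.96 * taus.std(ddof=1) / np.sqrt(n_sim)
    band = (tau_hat - half_width, tau_hat + half_width)
    return tau_hat, band, float(np.quantile(taus, 1.0 - alpha))
```

Running `tau_simulation` separately on each year's data, with `n0` held fixed, yields a series of estimates whose decrease (increase) would be read as convergence (divergence), while a (1 - alpha) quantile close to 1 would be read as full convergence.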

Monte Carlo illustration of convergence evaluation
For simulations, we set up a DGP that resembles the empirical data: the estimated log-normal EU-wide income distribution ($F_L$) is the basis, with the mean and standard deviation being 9.5 and 0.7, respectively. In the DGP, for all countries but the deviant, each year's income is given by random realizations from $F_L$, i.e., $y_{i,t} \sim F_L(9.5, 0.7)$ for all periods indexed by $t$. All countries that have the same income distribution are indexed by $i \neq i^*$, where $i^*$ is reserved for a country that has a potentially different distribution of income.
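Let $q_{0.5}$ denote the median of $F_L$. For the deviant country indexed by $i^*$, we consider two types of deviations from $F_L$: additive (DGP-A) and multiplicative (DGP-M).
DGP-A: $y^{(a)}_{i^*,t} = y_{i^*,t} + \delta_t\, q_{0.5}$,
DGP-M: $y^{(m)}_{i^*,t} = m_t\, y_{i^*,t}$, (5)
where $y_{i^*,t} \sim F_L$, and $\delta_t \ge 0$ and $m_t \ge 1$ denote the year-specific additive share and multiplicative factor, respectively.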
In DGP-A, during the first five years, 11 the absolute deviation, given by the non-zero second term on the top-right part of Eq. (5), decreases linearly and proportionally to the median of $F_L$ (from 0.5 to 0.1 of $q_{0.5}$), while during the last three years, the distribution of $y^{(a)}_{i^*,t}$ is the same as for the rest of the countries. In DGP-M, during the first five years, the deviation is proportional to the values of $F_L$, with the multiplicative factor decreasing from 1.5 in the first year to 1.1 in the fifth. As in the case of DGP-A, during the last three years, the distribution of $y^{(m)}_{i^*,t}$ remains the same as for the rest of the countries.
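A minimal sketch of the two DGPs for the deviant country is given below. It assumes that 9.5 and 0.7 are the mean and standard deviation of log income (so that the median of $F_L$ equals exp(9.5)) and that the multiplicative factor, like the additive share, declines linearly over the first five years; the text states linearity only for the additive case. The names `ADD_SHARE`, `MULT`, and `simulate_deviant` are hypothetical.

```python
import numpy as np

MU, SIGMA = 9.5, 0.7   # assumed to be the parameters of log income
YEARS = 8              # five years with a deviation, three without

# Additive share of the median: 0.5 to 0.1, then no deviation.
ADD_SHARE = np.array([0.5, 0.4, 0.3, 0.2, 0.1, 0.0, 0.0, 0.0])
# Multiplicative factor: 1.5 to 1.1 (linear path is an assumption), then 1.
MULT = np.array([1.5, 1.4, 1.3, 1.2, 1.1, 1.0, 1.0, 1.0])

def simulate_deviant(kind, n, rng):
    """Return a (YEARS, n) array of incomes for the deviant country.

    kind = "A" adds a share of the median of F_L; kind = "M" multiplies by a factor.
    """
    q_median = np.exp(MU)  # median of a log-normal distribution
    base = rng.lognormal(mean=MU, sigma=SIGMA, size=(YEARS, n))
    if kind == "A":
        return base + ADD_SHARE[:, None] * q_median
    if kind == "M":
        return base * MULT[:, None]
    raise ValueError("kind must be 'A' or 'M'")

def simulate_regular(n, rng):
    """Incomes of a non-deviant country: plain draws from F_L in every year."""
    return rng.lognormal(mean=MU, sigma=SIGMA, size=(YEARS, n))
```

Each row of the returned arrays corresponds to one year, so a yearwise convergence evaluation would compare row t of the deviant country with row t of the regular countries.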
Fig. 1 plots the (empirical) CDFs of the simulated data, restricting the plotting range for better visibility. Both the (decreasing) additive and the multiplicative deviations are observable.
Fig. 2 plots the results of the implemented convergence evaluation and testing procedure. Black dots represent the evaluation statistic $\hat{\tau}_{0.95}$ with its 95% confidence bounds around it in grey.
The upper dashes in blue stand for the 95% quantile of the simulated realizations $\tau^{(s)}_{0.95}$ (see the discussion below Eq. (3)), which allows inference about the acceptability of the null hypothesis of equal income distributions. The top and bottom panels of the figure correspond to DGP-A and DGP-M, respectively, whereas the left and right panels represent the testing without and with the rescaling adjustment, correspondingly.
In both the additive and the multiplicative cases, there is a clearly identifiable pattern of convergence and stabilization after the fifth year observed with non-rescaled data (see Fig. 2(a) and (c)), just as was embedded into the underlying DGPs, with a statistically significant reduction in $\hat{\tau}_{0.95}$ values. As far as one can judge from these few realizations, the limiting condition in Eq. (3) also seems to work well enough, with the 95% quantile of $\tau^{(s)}_{0.95}$ (blue dashes) being very close to one.
The results of the evaluation using the scale-adjusted data also seem quite reasonable (see the right panel of Fig. 2). As expected, in DGP-M with the multiplicative perturbation of realizations, which does not affect the inequality characteristics of an economy, the scale-adjusted $\hat{\tau}_{0.95}$ is steady (see Fig. 2(d)). In DGP-A with the additive perturbations, it shrinks significantly over time (see Fig. 2(b)). However, the limiting condition given by Eq. (3) seems to hold less precisely in the DGP-A case for the smaller deviations observed in the fourth and fifth years. This can be connected to the rather moderate number of MC iterations ($S = 500$), as well as $n_0 = 250$ and $b = 200$ (within each iteration), used in the simulations, 12 leading to a more substantial variance of potential realizations. It might also be implicitly caused by the optimization-based pre-estimation of the rescaling constants $\{\hat{b}_i\}$.
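The contrast between the two perturbations can be checked with any scale-independent inequality metric. The sketch below uses the Gini coefficient purely as an example (the choice of metric and the function name `gini` are assumptions of this illustration): a multiplicative perturbation leaves it unchanged, whereas adding a positive constant to every income lowers it.

```python
import numpy as np

def gini(y):
    """Gini coefficient of a sample of non-negative incomes."""
    y = np.sort(np.asarray(y, float))
    n = len(y)
    ranks = np.arange(1, n + 1)
    # Standard rank-based formula for the sample Gini coefficient.
    return float(2.0 * np.sum(ranks * y) / (n * np.sum(y)) - (n + 1) / n)

def compare_perturbations(y, factor=1.5, share=0.5):
    """Gini of the original, multiplicatively scaled, and additively shifted incomes."""
    y = np.asarray(y, float)
    shifted = y + share * np.median(y)   # additive deviation, as in DGP-A
    scaled = factor * y                  # multiplicative deviation, as in DGP-M
    return gini(y), gini(scaled), gini(shifted)
```

For a log-normal sample drawn with the parameters of the simulation, the first two returned values coincide up to rounding error, while the third is smaller, which is in line with the steady scale-adjusted statistic in DGP-M and the shrinking one in DGP-A.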
The MC results reveal that the suggested methodology can be informative about the total and scale-independent (inequality-affecting) convergence of income distributions. 13

Convergence of income distributions in the EU
We first evaluate the convergence of income distributions among the considered 27 EU countries 14 (EU27). Despite the original diversity of EU countries, the harmonization of institutions, together with the cohesion policy and structural funds, is supposed to create adequate capacity, mechanisms, and means fostering their economic convergence. Fig. 3(a) plots the statistic $\hat{\tau}_{0.95}$ and the 95% quantile of the realized $\tau^{(s)}_{0.95}$ for unadjusted income.
The convergence of income distributions among the EU countries from 2007 to 2014 is clearly statistically significant, even in the aftermath of the financial crisis, which is consistent with the findings of Cabral and Castellanos-Sosa (2019). The largest convergence within a single year took place just after the financial crisis. The convergence seems to have become slower for several years afterwards, but it accelerated again after 2012, i.e., during the latest years of recovery in the EU. 15
There is less convincing evidence on convergence of the inequality-affecting part of income distributions (see Fig. 3(b)): $\hat{\tau}_{0.95}$ had an initial downswing after the crisis, with an upswing in the later years that seems to be inversely connected with the economic situation during this period.
We finalize the illustration with an intriguing example of convergence of income distributions between Central and Eastern Europe (CEE) and Southern Europe (SoE) (see the lower panel of Fig. 3). Considering the income distributions within these large macro-regions of the EU, 16 not only did convergence take place during 2007-2014, but, using the scale-adjusted data, the hypothesis of full convergence in 2014 (and even earlier) cannot be rejected at the 5% significance level (see Fig. 3(d)). Therefore, any scale-invariant inequality metric would become very similar for the two regions (for such evidence see, e.g., Benczur et al., 2017, and JRC, 2020). Thus, irrespective of the remaining differences in the scale of income, the CEE and SoE regions taken as a whole became quite similar in terms of income distribution.
14 Does not include Croatia.
15 The largest convergence in 2008 is mostly associated with the fact that the high-income North-Western (NW) economies in the EU were among the first to experience the real consequences of the financial crisis in terms of slower economic growth. Most lower-income countries from other EU regions experienced a larger slump of economic growth relative to the NW region later, mostly in 2009, thus resulting even in some divergence (see Fig. 3(a)). The speeding up of convergence since 2012 is mostly associated with the recovery of growth (in absolute and relative terms) in the lowest-income EU countries from Central and Eastern Europe, especially as compared with the Southern European countries.
16 Cseres-Gergely and Kvedaras (2019) contains the analogous evaluations of convergence between (and also within) other sub-groups of EU countries.