Polarization Measurement and Inference in Many Dimensions When Subgroups Can Not Be Identified

The most popular general univariate polarization indices for discrete and continuous variables are extended and combined to describe the extent of polarization between agents in a distribution defined over a collection of many discrete and continuous agent characteristics. A formula for the asymptotic variance of the index is also provided. The implementation of the index is illustrated with an application to Chinese urban household data drawn from six provinces in the years 1987 and 2001 (years spanning the growth and urbanization period subsequent to the economic reforms). The data relate to household adult equivalent log income and adult equivalent living space, which are both continuous variables, and the education of the head of household, which is a discrete variable. For this data set, combining the characteristics changes the view of polarization that would be inferred from considering the indices individually.


Introduction
The functionings and capabilities approach to wellbeing measurement (Sen 1992) has given considerable impetus to multidimensional analyses of wellbeing (Grusky and Kanbur 2006). The argument is that the wellbeing of individuals is not just a matter of the incomes they have or could achieve; among other things it depends on their health and educational status, their political freedoms and environmental factors. In the absence of a well specified aggregator of these many sensibilities (i.e. some form of utility function), wellbeing has to be evaluated over these many dimensions, which of course could be measured discretely or continuously.
The multivariate polarization measure presented here is founded upon the notion of polarization within a population distribution f(x) of individual characteristics x into potentially many possible groups which are not identified¹ a priori. Imagine for example a population which is a mixture of K classes with respective distributions f_k(x) and proportions w_k so that the population distribution f(x) may be written as:

$$f(x) = \sum_{k=1}^{K} w_k f_k(x)$$

When no class identifier or information on agent membership of the f_k(x)'s (i.e. the sub group distributions) is available, all that is observed is f(x), the population distribution. This is the unidentified case for which the polarization measures discussed herein are appropriate. Sometimes, given additional information, the sub distributions can be estimated, facilitating calculation of the probability of group membership for an individual with characteristics x (partial identification) or indeed perfect stratification, where group membership is known with probability 1. For example Anderson, Pittau and Zelli (2011) posit that the incomes of each class are driven by distinct stochastic processes which precipitate distinct log normal distribution specifications for the f_k(x)'s, permitting, through the employment of semi parametric techniques, estimation of all of the parameters of the mixture distribution. This permits estimation of the extent of polarization between any two classes in a similar fashion to when class membership is completely identified. Here no such information is available.

¹ Here "identification" refers to known membership of a group rather than a sense of kinship or proximity to other members in a group, which is the sense in which it will be used later in defining the polarization measure.
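The mixture structure described above can be illustrated with a minimal sketch. The two log normal classes, their weights and their parameters are hypothetical values chosen only for illustration, and the function names are not taken from any cited paper. The sketch shows the population density f(x) as a weighted sum of class densities and, for the partially identified case, the probability of class membership given x.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Log normal density with log-scale mean mu and standard deviation sigma."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, params):
    """Population density f(x) = sum_k w_k f_k(x)."""
    return sum(w * lognormal_pdf(x, mu, s) for w, (mu, s) in zip(weights, params))

def membership_probabilities(x, weights, params):
    """Probability of belonging to each class given x (partial identification)."""
    parts = [w * lognormal_pdf(x, mu, s) for w, (mu, s) in zip(weights, params)]
    total = sum(parts)
    return [p / total for p in parts]

# Hypothetical two-class population: weights and (mu, sigma) pairs are assumptions.
weights = [0.6, 0.4]
params = [(0.0, 0.5), (1.5, 0.4)]
probs = membership_probabilities(2.0, weights, params)
```

In the unidentified case only mixture_pdf would be observable; membership_probabilities requires the class parameters, which is exactly the additional information that is assumed to be unavailable here.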
Esteban and Ray (1994) and Duclos, Esteban, and Ray (2004) posited a collection of propositions with which such a Polarization measure for the unidentified case should be consistent and proposed a collection of univariate measures appropriate for a variety of circumstances that would reflect such polarization between potentially many groups. The propositions are based upon a so-called Cohesion (or Identification) and Alienation nexus wherein notions of polarization are fostered jointly by an agent's sense of increasing within-group identity or association and between-group distance or alienation.
There have been several proposed univariate polarization indices which focus on an arbitrary number of groups² in this unidentified case (Esteban and Ray 1994; Esteban, Gradin and Ray 1998; Zhang and Kanbur 2001; Duclos, Esteban and Ray 2004) and a similar number that focus on just two identified groups, i.e. when the sub distributions above are observed (Alesina and Spolaore 1997; Foster and Wolfson 1992; Wolfson 1994; Wang and Tsui 2000). Anderson (2004) considers tests for various types of polarization between two identified or partially identified groups based upon the anatomy of their respective distributions, or the mixture of their respective distributions when the groups are only partially identified. While much work has been done on extending one dimensional wellbeing measures to many dimensions in the context of poverty (Duclos, Sahn and Younger 2006) and inequality measurement (Maasoumi 1986, 1999; Koshevoy and Mosler 1997; Tsui 1995; Anderson 2008),³ little has been done in extending polarization measures to the many dimensioned case. Gigliarano and Mosler (2009) develop a family of multivariate polarization measures based upon measures of between and within group multivariate variation and relative group size which exploit notions of subgroup decomposability, and Anderson (2010) and Anderson, Linton and Leo (2011) have developed a trapezoidal measure of polarization which can be applied to two identifiable groups or within a population distribution provided at least two modal points are identified (i.e. the partially identified case). However, multivariate polarization measures have not been developed for the more general non-identified many group case, nor for the case where the joint distribution of sensibility indicators is a mixture of discrete and continuous variables.⁴
An excellent summary of the properties of the univariate indices is to be found in Esteban and Ray (2007), wherein the properties of indices are evaluated in terms of their coherence with some basic axioms that reflect three broad notions: 1) when there is only one group there is little polarization; 2) polarization increases when within group inequality is reduced; 3) polarization increases when between group inequality increases. The axioms are formed around a notional univariate density that is a mixture of kernels f(x, a) that are symmetric and uni-modal on a compact support [a, a+2], with E(x) = μ = a+1 also representing the mode. However these axioms are readily extended to multivariate densities of continuous variables by thinking in terms of a notional multivariate density that is a mixture of multivariate kernels so that x is simply a j dimensioned vector. The kernels are subject to slides (location shifts) g(y) = f(y-x), which may be contemplated in terms of the Euclidean distance⁵ between vectors y and x, and squeezes (shrinkages) of the form $f^{\lambda}(x) = \frac{1}{\lambda} f\left(\frac{x-[1-\lambda]\mu}{\lambda}\right)$ (0 < λ < 1), where now μ is a j dimensioned vector of means or modal values of the multidimensional kernel f(x, a) that is symmetric on a compact support [a, a+2] and a is a j dimensioned vector.
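The slide and squeeze operations can be made concrete with a small univariate sketch. The triangular kernel on [a, a+2] and the midpoint-rule integrator are assumptions introduced purely for illustration; any symmetric uni-modal kernel on that support would serve. The sketch checks numerically that a λ-squeeze leaves total mass and the mean unchanged while shrinking the variance by the factor λ².

```python
def triangular_kernel(x, a=0.0):
    """Symmetric uni-modal density on [a, a+2] with mean and mode a+1."""
    u = x - a
    if 0.0 <= u <= 1.0:
        return u
    if 1.0 < u <= 2.0:
        return 2.0 - u
    return 0.0

def squeeze(f, mu, lam):
    """Lambda-squeeze: f_lam(x) = f((x - (1 - lam) * mu) / lam) / lam, 0 < lam < 1."""
    return lambda x: f((x - (1.0 - lam) * mu) / lam) / lam

def slide(f, shift):
    """Location shift: g(y) = f(y - shift)."""
    return lambda y: f(y - shift)

def integrate(g, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * h) for i in range(steps)) * h

squeezed = squeeze(triangular_kernel, mu=1.0, lam=0.5)
mass = integrate(squeezed, 0.0, 2.0)
mean = integrate(lambda x: x * squeezed(x), 0.0, 2.0)
variance = integrate(lambda x: (x - mean) ** 2 * squeezed(x), 0.0, 2.0)
```

The unsqueezed triangular kernel has variance 1/6, so with λ = 0.5 the squeezed variance should be close to 0.25/6 while the mass and mean stay at 1.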
Potential indices are evaluated in the context of such changes in terms of the extent to which they satisfy a set of axioms which reflect the following set of ideas. The squeeze of a uni-modal distribution cannot increase polarization and symmetric squeezes of the two kernels cannot reduce polarization. Sliding two kernels away from one another increases polarization and common population scaling preserves the polarization ordering. Polarization indices have to come from a family where, if x and y are independently distributed with marginal distributions f(x) and f(y), the index is the expected value of some function T(f(x),|x-y|) which is increasing in its second argument. Symmetric squeezes of the sub distributions weakly increase polarization. The index should be non-monotonic with respect to outward slides of the sub distributions and flipping the distribution around its support should leave polarization unchanged. Most of these ideas can be contemplated with respect to multivariate densities of continuous variables, though there is some difficulty when multivariate densities of discrete variables are contemplated unless slides of the discrete outcome values are permitted and squeezes of the distributions are contemplated in terms of transfers of mass between outcome values.

⁴ Furthermore extensions of the stochastic dominance techniques introduced in Anderson (2004), which really explore the anatomy of polarizing distributions, would prove cumbersome in many dimensions because it is not obvious how to define a sensible partition of the distribution across those many dimensions.
⁵ Other distance metrics (for example Mahalanobis 1936 or Bregman 1967) could be employed.
Here the most popular general univariate polarization indices for discrete (Esteban and Ray 1994) and continuous (Duclos, Esteban and Ray 2004) variables are combined and extended to describe the extent of polarization between agents in a distribution defined over a collection of many discrete and continuous agent characteristics. The univariate indices have been demonstrated to satisfy the aforementioned axioms. The implementation of the index is illustrated with an application to Chinese urban household data drawn from six provinces in the years 1987 and 2001 (years spanning the growth and urbanization period subsequent to the economic reforms). The data relate to household adult equivalent log income and adult equivalent living space, which are both continuous variables, and the education of the head of household, which is a discrete variable.

The Extension to Many Variables both Discrete and Continuous
The multivariate generalization of the Duclos, Esteban, and Ray (2004) (DER) polarization index is, like DER, based upon the sample equivalents of the population concepts. For scalar continuous x with distribution function F(x) and density f(x) the DER index is given by:

$$P_\alpha(F) = \int\!\!\int f(x)^{1+\alpha} f(y)\,|y-x|\,dy\,dx = \int f(x)^{\alpha}\,a(x)\,dF(x) \qquad [1]$$

Some intuition for the index may be gained by thinking in terms of f(x)^α as the degree of identification or cohesion experienced by an agent with income x (f(x) being an indicator of mass around x) and a(x) as the degree of alienation experienced by a person with income x where:

$$a(x) = \int |y-x|\,dF(y)$$

The product f(x)^α a(x), the area of a rectangle with height f(x)^α and base a(x), is then the degree of polarization experienced by an agent with x, and [1] corresponds to the average polarization experienced across the population of agents. DER (in Duclos, Esteban and Ray 2004a) demonstrate the estimator of [1] to be asymptotically normally distributed with an asymptotic variance V given by:

$$V = \operatorname{var}_{f(y)}\left( (1+\alpha)\, f(y)^{\alpha} a(y) + y\int f(x)^{\alpha}\,dF(x) + 2\int_{y}^{\infty} (x-y)\, f(x)^{\alpha}\,dF(x) \right) \qquad [2]$$

Their development of the variance formula is sketched in the Appendix. A similar discrete variable index is provided in Esteban and Ray (1994) and is given by:

$$P^{ER}_\alpha = K \sum_{i=1}^{n}\sum_{j=1}^{n} \pi_i^{1+\alpha}\,\pi_j\,|x_i - x_j|$$

where π_i is the sample weight of the i'th observation and K is a normalizing factor. Development of the polarization index was founded on a set of axioms that such an index should obey; the axioms concern changes (squeezes and slides) in the uni-modal sub distributions in the mixture distribution that is f(x). The resultant index reflects the two primary factors that underlie polarization, the alienation or distance between groups (given by |y-x|) and the association within a group (given by f(x)^α). Indeed the intuitive interpretation of P_α as the average value of the areas of all possible trapezoids that can be formed under f(x) whose average height is f(x)^α and whose base is |x-y| can be related to the trapezoidal index of polarization employed in Anderson (2010) to study multivariate poverty states and in Anderson, Leo and Linton (2011) to study multivariate convergence issues.
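A minimal sketch of sample analogues of the two univariate indices follows. It is illustrative only and makes several assumptions that are not taken from the source: a Gaussian kernel density estimate stands in for f, the default bandwidth is the familiar rule of thumb 1.06 σ n^(-1/5), alienation is computed as the plain sample mean of absolute differences rather than with DER's own estimator, and the function names are hypothetical.

```python
import math

def kernel_density(sample, h):
    """Return a Gaussian kernel density estimate built from sample with bandwidth h."""
    n = len(sample)
    c = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return lambda x: c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

def der_index(sample, alpha, h=None):
    """Sample analogue of the continuous index: mean of f_hat(x_i)**alpha * a_hat(x_i)."""
    n = len(sample)
    if h is None:
        mean = sum(sample) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        h = 1.06 * sd * n ** (-0.2)  # rule-of-thumb bandwidth (assumption)
    f_hat = kernel_density(sample, h)
    total = 0.0
    for xi in sample:
        identification = f_hat(xi) ** alpha
        alienation = sum(abs(xi - xj) for xj in sample) / n
        total += identification * alienation
    return total / n

def er_index(values, weights, alpha, K=1.0):
    """Discrete index: K * sum_i sum_j pi_i**(1 + alpha) * pi_j * |x_i - x_j|."""
    return K * sum(
        p_i ** (1.0 + alpha) * p_j * abs(x_i - x_j)
        for x_i, p_i in zip(values, weights)
        for x_j, p_j in zip(values, weights)
    )
```

With alpha set to 0 the identification term drops out and der_index reduces to the mean absolute difference of the sample, which is a convenient sanity check on any implementation.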
Here α is a polarization sensitivity parameter⁶ chosen by the investigator such that 0.25 ≤ α ≤ 1, with higher values of α corresponding to increased sensitivity. The same axioms can be applied when x is a vector and where ||x-y|| is the Euclidean distance between the vectors.⁷

⁶ Note that when α = 0 the index is in essence twice the Gini coefficient; thus a similar value in the following would provide a multivariate version of a Gini like coefficient and its variance.
⁷ Anderson, Crawford and Leicester (2011) employ Euclidean distance in developing a nonparametric approach to multivariate welfare rankings.
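The relation between the α = 0 case and the Gini coefficient can be checked numerically. The sketch below assumes the standard definition of the Gini coefficient as the mean absolute difference divided by twice the mean, so that the α = 0 index equals 2μG, and exactly 2G once the data are normalized by their mean. The income values are hypothetical.

```python
def mean_absolute_difference(sample):
    """Average of |x_i - x_j| over all ordered pairs: the alpha = 0 index."""
    n = len(sample)
    return sum(abs(x - y) for x in sample for y in sample) / (n * n)

def gini(sample):
    """Gini coefficient: mean absolute difference divided by twice the mean."""
    mean = sum(sample) / len(sample)
    return mean_absolute_difference(sample) / (2.0 * mean)

incomes = [1.0, 2.0, 3.0, 4.0]  # hypothetical values
mean_income = sum(incomes) / len(incomes)
p0 = mean_absolute_difference(incomes)
normalized = [x / mean_income for x in incomes]
p0_normalized = mean_absolute_difference(normalized)
```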
www.economics-ejournal.org
Let w_i and z_i be jointly distributed vectors describing the status of the i'th agent, with w_i being a k x 1 vector of continuous variables and z_i being an h x 1 vector of discrete variables, with i = 1,..,n being the elements of the sample. The continuous variables all reflect wellbeing positively and for convenience are defined on R^k_+, and the discrete variables are ordered integers reflecting positive wellbeing in the same fashion.⁸ The joint density of the w's for a given configuration of z's is f(w|z) and the joint probability of the z's is p(z), so that the joint density of the w's and z's for the i'th agent with continuous characteristics w_i and discrete characteristics z_i is given by f(w_i, z_i) = f(w_i|z_i)p(z_i), which corresponds to her degree of identification. As for the alienation component, let x_i be the stacked vector of w_i and z_i; then the dimension normalized Euclidean distance⁹ between agents i and j, given by ||x_i - x_j||, is well defined and the index may be written as:

$$P_\alpha = \sum_{z_x}\sum_{z_y}\int\!\!\int f(w_x,z_x)^{1+\alpha}\, f(w_y,z_y)\,\|y-x\|\,dw_x\,dw_y$$

where x and y denote the stacked vectors of (w_x, z_x) and (w_y, z_y). Here summation is over the domain of each element of the z vector and integration is over the domain of each element of the w vector. As in the univariate case the alienation or distance between groups is given by ||y-x|| and the association within a group is given by f(w,z)^α in exactly the same fashion.¹⁰
By employing kernel estimates of the conditional multivariate distributions and sample estimates of the population proportions p(z), the sample equivalents can be formed. Given n observations on Q variables in an n x Q matrix X with typical element x_iq, i = 1,..,n, q = 1,..,Q, and typical row x_i, and writing $\hat{f}(x_i) = \hat{f}(w_i|z_i)\,\hat{p}(z_i)$, the index can be seen to be:

$$\hat{P}_\alpha = \frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{j=1}^{n} \hat{f}(x_i)^{\alpha}\,\|x_i - x_j\|$$

The multivariate version of [2], the variance of the index, is given by:

$$V = \operatorname{var}_{f(y)}\left( (1+\alpha)\, f(y)^{\alpha} a(y) + \|y\|\int f(x)^{\alpha}\,dF(x) + 2\int_{\|x\|>\|y\|} \left(\|x\|-\|y\|\right) f(x)^{\alpha}\,dF(x) \right)$$

where $a(y) = \int \|x-y\|\,dF(x)$. After ordering the vectors x_i on ||x_i|| as x_i^o, the first, second and third terms of the i'th element of the variance vector may be respectively estimated in an obvious fashion as:

$$(1+\alpha)\,\hat{f}(x_i^{o})^{\alpha}\,\frac{1}{n}\sum_{j=1}^{n}\|x_i^{o}-x_j^{o}\|, \qquad \|x_i^{o}\|\,\frac{1}{n}\sum_{j=1}^{n}\hat{f}(x_j^{o})^{\alpha}, \qquad \frac{2}{n}\sum_{j=i+1}^{n}\left(\|x_j^{o}\|-\|x_i^{o}\|\right)\hat{f}(x_j^{o})^{\alpha}$$

¹⁰ Note that Esteban and Ray (1994) and DER respectively offer different ranges for α for discrete univariate and continuous univariate distributions. This can be accommodated in the present context by considering the association component as f(w|z)^{α_c} p(z)^{α_d}, where α_c is the polarization parameter for the continuous components and α_d is the polarization parameter associated with the discrete components.
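A sample version of the mixed discrete and continuous index might be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: f(w|z) is estimated with a product Gaussian kernel within each discrete cell, p(z) by cell proportions, "dimension normalized" distance is taken to mean the Euclidean distance with the sum of squares divided by the number of variables Q, and the bandwidths are supplied by the caller. All function names are hypothetical.

```python
import math
from collections import defaultdict

def conditional_density(w, cell_rows, bandwidths):
    """Product Gaussian kernel estimate of f(w | z) from the rows sharing one z value."""
    total = 0.0
    for wi in cell_rows:
        term = 1.0
        for a, b, h in zip(w, wi, bandwidths):
            term *= math.exp(-0.5 * ((a - b) / h) ** 2) / (h * math.sqrt(2.0 * math.pi))
        total += term
    return total / len(cell_rows)

def normalized_distance(x, y):
    """Euclidean distance with the sum of squares divided by the dimension (assumed form)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def multivariate_polarization(W, Z, bandwidths, alpha_c, alpha_d=None):
    """Mean over agents of f(w|z)**alpha_c * p(z)**alpha_d times mean distance to others."""
    if alpha_d is None:
        alpha_d = alpha_c
    n = len(W)
    cells = defaultdict(list)
    for w, z in zip(W, Z):
        cells[tuple(z)].append(w)
    X = [tuple(w) + tuple(z) for w, z in zip(W, Z)]  # stacked vectors x_i
    total = 0.0
    for i in range(n):
        cell = cells[tuple(Z[i])]
        identification = (conditional_density(W[i], cell, bandwidths) ** alpha_c
                          * (len(cell) / n) ** alpha_d)
        alienation = sum(normalized_distance(X[i], X[j]) for j in range(n)) / n
        total += identification * alienation
    return total / n
```

Passing separate alpha_c and alpha_d values corresponds to treating the continuous and discrete association components with different sensitivity parameters; leaving alpha_d unset applies a single α to both.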
Essentially the generalization simply involves employing the dimension normalized Euclidean norm for |y-x| and |y| when they are Q dimensioned vectors, together with multivariate kernel estimates of f(w|z)p(z) for f(x) and f(y) raised to the appropriate power of α, the polarization sensitivity parameter, which is of course the choice of the investigator.
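One way to read this generalization for the variance is sketched below. It assumes a three-term structure analogous to the univariate DER variance, with the norm ||x_i|| playing the role of the scalar y, and it assumes that the dimension normalized norm divides the sum of squares by Q. The identification values f_hat(x_i)^α are taken as precomputed inputs, and all names are hypothetical.

```python
import math

def norm(x):
    """Dimension normalized Euclidean norm (assumed form: sum of squares divided by Q)."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def distance(x, y):
    return norm([a - b for a, b in zip(x, y)])

def variance_elements(X, ident, alpha):
    """Per-agent sum of the three variance terms; ident[i] holds f_hat(x_i) ** alpha."""
    n = len(X)
    norms = [norm(x) for x in X]
    mean_ident = sum(ident) / n
    elements = []
    for i in range(n):
        alienation = sum(distance(X[i], X[j]) for j in range(n)) / n
        first = (1.0 + alpha) * ident[i] * alienation
        second = norms[i] * mean_ident
        # Agents ranked above i when the vectors are ordered on their norms.
        third = 2.0 / n * sum((norms[j] - norms[i]) * ident[j]
                              for j in range(n) if norms[j] > norms[i])
        elements.append(first + second + third)
    return elements

def asymptotic_variance(X, ident, alpha):
    """Sample variance of the per-agent elements."""
    v = variance_elements(X, ident, alpha)
    mean_v = sum(v) / len(v)
    return sum((e - mean_v) ** 2 for e in v) / (len(v) - 1)

def standard_error(X, ident, alpha):
    """Standard error of the index: sqrt(V / n)."""
    return math.sqrt(asymptotic_variance(X, ident, alpha) / len(X))
```

For scalar non-negative data with α = 0 the per-agent element reduces to twice the agent's mean absolute distance plus the sample mean, which offers a simple check on the arithmetic.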

An Application: Chinese Urban Households 1987-2001
There is a suspicion that the economic reforms and one child policy initiated in China in the late 1970s, together with the massive urbanization over the period, changed the nature of urban households and families. The One Child Policy (OCP) intervention changed fundamentally the nature of both existing and anticipated marriage arrangements and influenced family formation decisions in many dimensions. Anderson and Leo (2007), in studying the impact of the policy on family formation in urban China, construed the OCP as a rationing policy constraining the quantity (but not the quality) of children. Evidence of increased positive assortative pairing of couples was observed, as was increased investment in children and, also consistent with rationing theory, income became less of a factor in determining family size though it did become an increasingly important determining factor in investment in children. At the same time there was an unprecedented growth in household incomes largely attributed to the economic reforms (the average annual growth rate of city incomes over the period 1990-1999 was over 18%, Anderson and Ge 2004) and a massive migration to the cities (in 1985 20% of the Chinese population was urbanized, by 1999 over 42.6% was urbanized, Anderson and Ge 2005). All of this could have changed substantially the way that households relate to one another, one aspect of which is the extent to which households are polarized. Table 1 presents the summary statistics of data from two independent surveys of urban households from three coastal and three interior provinces¹¹ in China for the years 1987 (for which there were 3651 observations) and 2001 (for which there were 4297 observations), a period over which the reforms took effect. The data were used to generate observations on log adult equivalent household income (at constant prices), adult equivalent¹² living space (in square meters) and an integer index of the education level of the head of household.
Thus in this example the household is the agent.
The marginal distributions of the three polarization characteristics exhibit quite distinct structures and changes over the period. When these variables are put together in a joint distribution its anatomy is likely to change over the period and, given its central role in the polarization calculus, the nature of polarization is likely to change. A considerable reduction in family size (the effect of the one child policy) and considerable increases in both equivalent incomes and living space (due in part to growth and in part to reductions in family size) and in educational attainment are evident. The variation in incomes, living space and education also increased over the period (suggesting a diverging society) whereas family size was clearly converging. Incomes and education are negatively skewed (long upper tail), incomes increasingly so and education decreasingly so over the period. Living space is positively skewed and decreasingly so over the period. Family size actually switched skew from negative to positive over the period. With the exception of family size and education, and family size and income, correlations between the variables have become uniformly stronger over the period. The family size/income correlation has actually switched sign, as though children have changed from being an inferior good to being a normal or luxury good over the period. To calculate the polarization statistic the continuous multivariate mean standardized pdf's were estimated using a multivariate standard normal kernel with a window width $h = 1.06\,\sigma(x)\,n^{-1/(4+k)}$ (Silverman 1986). The seven outcome educational scale was condensed to a three outcome scale, 1, 2 and 3 corresponding to high, medium and low educational attainments.

¹¹ The coastal provinces were Jilin, Shandong and Guangdong; the interior provinces were Sichuan, Shaanxi and Hubei.
¹² Equivalization was effected using the square root rule (Brady and Barber 1948).
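The window width rule quoted above can be sketched in a few lines. The sketch assumes that σ(x) is computed separately for each of the k continuous variables as the sample standard deviation, which is one common reading of the rule; the data rows are hypothetical and the function names are not taken from the source.

```python
import math

def sample_sd(column):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(column)
    mean = sum(column) / n
    return math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))

def window_widths(rows):
    """h_q = 1.06 * sigma_q * n ** (-1 / (4 + k)) for each of the k continuous columns."""
    n, k = len(rows), len(rows[0])
    scale = n ** (-1.0 / (4 + k))
    return [1.06 * sample_sd([r[q] for r in rows]) * scale for q in range(k)]

# Hypothetical observations on two continuous variables.
rows = [(0.0, 0.0), (2.0, 4.0)]
widths = window_widths(rows)
```

With k = 2 continuous variables, as in the application, the exponent is -1/6, so the window width shrinks quite slowly as the sample size grows.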
Table 2 reports the univariate polarization indices and standard errors for the continuous and discrete measures as per DER, Table 3 reports the paired multivariate measures and Table 4 reports the overall multivariate measures.
For all values of the polarization sensitivity parameter the index shows increases for the income, house space and education variables. Since the samples in the two years are independent of one another, the increases can be tested directly: they are seldom insignificant at usual levels of significance, though the differences do appear to diminish as the polarization sensitivity parameter increases. Increasing the sensitivity parameter can be interpreted as increasing the influence of the identification component relative to the inequality or alienation component, suggesting that polarization changes in the univariate case are largely a result of changes in the alienation component.
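Because the two samples are independent and the index estimator is asymptotically normal, the significance of a change can be assessed with a standard two sample z statistic. The sketch below is a generic illustration of that calculation, not the exact test reported in the tables, and the index values and standard errors used are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def polarization_change_test(p_early, se_early, p_late, se_late):
    """z statistic and one-sided p-value for an increase between independent samples."""
    diff = p_late - p_early
    se_diff = math.sqrt(se_early ** 2 + se_late ** 2)  # independence: variances add
    z = diff / se_diff
    return z, 1.0 - normal_cdf(z)

# Hypothetical index values and standard errors for the two survey years.
z, p_value = polarization_change_test(0.20, 0.01, 0.23, 0.01)
```

Testing for depolarization uses the same statistic with the lower tail, normal_cdf(z), as the p-value.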
The joint pair-wise distributions reported in Table 3 exhibit quite different effects to the univariate cases. At low levels of polarization sensitivity significant increases in polarization are still the norm for all pair-wise comparisons, but as higher orders of polarization sensitivity are considered depolarization becomes the norm in almost all cases, and significantly so. Thus pairing or combining variables appears to dilute the alienation effect even further.
Turning to the polarization measures across all three characteristics which, together with a test for depolarization, are reported in Table 4, note that the null of depolarization is not rejected at any level of polarization sensitivity. Furthermore the differences, which are now reductions in polarization, are more substantial the more heavily weighted is the identification component. So it appears that expanding the dimensions over which polarization is considered further erodes the polarization observed when the various household characteristics are considered individually.

Conclusions
Many researchers have argued that, in the absence of a plausible aggregator of the many factors that affect wellbeing, its measurement needs to be pursued in the context of the several variables available rather than relying on just one of them. This applies to most aspects of wellbeing measurement. Here, by combining multivariate versions of the Polarization indices developed in Esteban and Ray (1994) and Duclos, Esteban, and Ray (2004) the polarization measurement toolkit has been extended to the case where the status of an agent is represented by many characteristics which can be both discretely and continuously measured and the agent subgroups in a population are not identified. The asymptotic variance of the statistic has been provided to facilitate inference.
As an example the statistic was applied to data from two independent surveys of urban households from three coastal and three interior provinces in China for the years 1987 and 2001, a period over which economic and family size reforms took effect and a period over which there was extensive urbanization. The data reflected the log adult equivalent household income (at constant prices), adult equivalent living space (in square meters) and an integer index of the education level of the head of household, each of which may be construed as a contribution to the wellbeing of the household.
The results, while obviously specific to these particular data, were salutary with regard to the use of univariate as opposed to multivariate polarization indices. While the individual univariate indices all reflected significant increases in polarization between households over the period of the reforms, when they were combined the polarization result was attenuated. For pair-wise combinations of the variables significant polarization was detected at low levels of polarization sensitivity but at high levels of polarization sensitivity significant depolarization was detected. When all three variables were combined in an index, significant depolarization was detected at all levels of polarization sensitivity.

Appendix
Applying the law of large numbers to [A1], noting that root n times [A1] has expectation 0, invoking the central limit theorem and collecting and re-arranging the terms yields the variance formula [2].