Identity-based Earning Discrimination among Chinese People

Abstract
Hukou registration is an instrument for controlling unplanned population and capital movements, which the Chinese Communist Party has used extensively since the 1950s. It requires that each Chinese citizen be classified as either an agricultural or nonagricultural hukou inheritor and be distinguished by their location with respect to an administrative unit. Hukou distribution used to be entirely determined by birth, but nowadays Chinese citizens can self-select their hukou status based on their ability, which causes selection bias in conventional wage decompositions by hukou type. To avoid this bias, I estimated hukou-based earning discrimination by matching Chinese individuals on a rich set of individual-, family-, and society-level characteristics. Using a recent nationally representative dataset, this paper finds significant earning discrimination against agricultural hukou people. I further investigated the impact of hukou adoption within work ownership, work and employer types, and labor contract conditions. I argue that the earning difference by hukou is not due to rural-urban segregation; rather, it is systematic and institutionally enforced. This is because discrimination exists only when agricultural hukou people are employed by others and where a labor contract is enforced, not when they are self-employed or work without a labor contract. Moreover, they face discrimination only when they work for the Chinese government, not when they work for private firms, and they face higher discrimination in nonagriculture-related professions than in agriculture-related professions.


Introduction
China's transition from a planned economic regime to a liberal economic regime has brought a remarkable welfare improvement for Chinese people. One of the key features of this transition is the gradual dismantling of traditional institutions. The transition is not yet complete because the traditional institutions that historically regulated the flow of capital and labor still play a significant role in Chinese society (Lardy, 1998; London, 2014). One such institution is the hukou, or household registration, system. 1 The system was designed to strictly limit people's mobility within the Chinese mainland (Song, 2014). It was considered a necessary component of the centrally planned economy because a strict central plan requires the ability to allocate resources not only at the enterprise and sectoral levels but also across geographic locations (Liu, 2005). However, it has now been identified as a major obstacle to China's quest to become a modern developed economy (Chan, 2009), which makes it the most pressing area for policy reform. Many have also identified the hukou system as the direct cause of extreme economic and social inequality in China (Afridi et al., 2015; Liu, 2005; Wu and Treiman, 2007).
Other significant negative consequences include differential access to public services and welfare programs, different costs of living in cities, and discriminatory labor market treatment (Song, 2014). The system is also held responsible for large productivity losses (Au and Henderson, 2006), the middle-income trap (Zhang et al., 2013), and nonoptimal allocation of resources (Whalley and Zhang, 2011).
The majority of existing research has attempted to partition the wage differential between the two hukou groups into two components: one explained by differences in productivity and an unexplained one that is often referred to as discrimination (Blinder, 1973; Oaxaca, 1973). 2 Most papers compared wages between urban residents and rural-urban migrants and estimated different magnitudes of the unexplained wage gap.
For example, Song (2016) observed discrimination against rural-urban migrants in state-owned enterprises (SOEs) and private firms within urban China using the 2008 wave of the Rural-Urban Migration in China survey. Applying regression and decomposition techniques, the study found that urban hukou holders earn about 50% more than rural hukou holders do in the SOEs, but only 5% more in the private sector. 3 While this literature has many merits and is a valuable resource for understanding hukou-based earning discrimination, its regression-based empirical analyses, including Song (2016), suffer from several major limitations. First, no existing study considered the fact that hukou status may not be exogenous, because outstanding rural-urban migrants can be granted urban hukou. The distribution of hukou status may therefore not be random, because the status can be self-selected based on ability. 4 Second, the datasets used in the existing studies do not allow their parametric estimations to control for many important factors that may influence labor market outcomes, including ability, social and family background, health, attitude, ambition, work effort, networking, and so on (Bowles et al., 2000, 2001a, 2001b; Taubman, 1976; Filippin and Paccagnella, 2012; Heineck and Anger, 2010; Heineck, 2011). Moreover, ability has multiple dimensions, such as mathematical, educational, language, communication, and cognitive ability. 5 The standard earning equation, which includes a person's age, schooling, and experience and the parents' schooling, occupation, and income, cannot explain two-thirds to four-fifths of the variance of earnings (Bowles et al., 2001b). So, the existing literature is likely to suffer from omitted variable biases.

1 Hukou has two main classifications: one by hukou type and the other by hukou location. The hukou type is either "agricultural" or "nonagricultural", while the hukou location categorizes each person according to his/her place of hukou registration. Together, the hukou type and location define a citizen's eligibility for public services in a specific locality. The place of registration is determined mainly by birth, and it is the individual's official and only "permanent" residence (Chan, 2009). Until 2003, a newborn baby's hukou type followed the mother's rather than the father's (Wang, 2004). Later, if the parents differed in their hukou types, their children could freely choose either the father's or the mother's hukou type (Huang, 2012). To attract outstanding migrants, different cities and provinces have introduced programs that allow people to convert their hukou type from agricultural to nonagricultural and to transfer their hukou from one place to another. Cities usually set a limit on how many hukou they offer each year, and the criteria for alteration are entirely ability based. Thus, ordinary people are usually not able to alter their hukou. There are also cases in which people do not want to convert their hukou because they are unwilling to give up the contract land that is tied to their agricultural hukou (Chen and Fan, 2016).
2 See Song (2014) for a review of existing literature.
3 Others report that a substantial share of the average wage difference cannot be explained by observable characteristics: Lee (2012) found 28% (using China's Urban Labor Survey in 2005), Gravemeyer et al. (2011) 52.9% (using Shenzhen Survey 2005 data from Shenzhen city), Gagnon et al. (2012) 40% (using Census 2005), and both Deng (2007) and Liu (2005) 60% (using China Household Income Project (CHIP) 2002 and the Beijing sample from the CHIP in 1995, respectively). They called the unexplained portion of the wage gap discrimination. Gagnon et al. (2012) confirmed that hukou-based wage discrimination is attributable only to hukou type rather than to hukou location. Note that one study, by Dong and Bowles (2002), claims that there is no discrimination against rural migrants.
Third, past empirical results were also not representative of both types of hukou people or of China as a whole. Earlier surveys were residence based or were conducted in a few cities in urban China, where it was difficult to obtain a representative sample of agricultural hukou people other than rural-urban migrants living in urban areas. Even the majority of rural-urban migrants were not well covered because they mostly lived on construction sites or in workplace dormitories and hence may not have been included in residence-based samples. 6 Thus, the residence-based surveys may be biased due to the omission of migrant people, and urban-focused surveys may be biased due to the exclusion of agricultural hukou people living in rural areas. Moreover, the datasets were relatively old and may not reflect the recent state of the Chinese labor market.
This paper addresses the limitations listed above and provides improved evidence of hukou-based earning discrimination using comprehensive, nationally representative survey data. While the existing literature examined only earning differences between urban hukou people and rural-urban migrants, this paper first estimates overall earning differences between urban hukou and rural hukou people and then verifies whether the earning differences change if rural hukou people are migrants. I observe both wage and total income, where total income includes both wage and nonwage earnings. 7 I also estimated earning differences by hukou type within work ownership types, work types, employer types, and labor contract conditions. To my knowledge, no other study in the literature has examined hukou-based discrimination using a similar approach. I applied a nonparametric propensity score matching (PSM) strategy, which is a common program impact evaluation model stemming from Rosenbaum and Rubin's (1983) seminal work. The matching method should produce better results and offers the promise of causal inference with fewer assumptions (Guo and Fraser, 2015; Ho et al., 2007). Using nationally representative survey data, I found that hukou-based earning discrimination exists against agricultural hukou people. They experience discrimination only if others employ them; if they are self-employed, no statistically discernible discrimination appears. Discrimination exists in both agriculture-related and nonagriculture-related professions; however, it is worst in nonagriculture-related professions. Agricultural hukou people face earning disadvantages only in government jobs, not in private jobs. They experience higher wage discrimination where a labor contract is enforced than where it is not. Within labor contract conditions and work types, discrimination is evident if wage income dominates total income; if nonwage income dominates total income, discrimination is not apparent.

4 Estimating hukou-based discrimination is different from estimating gender-based or ethnicity-based discrimination, where the subjects cannot change their identity and there is therefore less chance of self-selection bias. Therefore, models designed to estimate gender- and ethnicity-based discrimination, such as the Oaxaca-Blinder (OB) decomposition, should not be applied directly to hukou-based discrimination because the results can be misleading.
5 For example, Chen and Hoy (2011) conducted face-to-face interviews with personnel management in 21 companies in Shanghai and found that they were concerned that people with rural hukou have local accents that cause difficulties in communication, lack trust, and are highly mobile. These productivity-related characteristics may explain some of the remaining wage differentials.
6 For example, the CHIP data, which were used by Démurger et al. (2009) and Deng (2007), were collected from residential communities. Others such as Knight et al. (1999) used data from four cities in 1995, Gravemeyer et al. (2011) utilized a survey from Shenzhen in 2005, Dong and Bowles (2002) used data from only two cities, Dalian and Xiamen, in 1998, and Lee (2012) [...]
The findings confirm that the earning difference by hukou is institutionally determined rather than a product of simple rural-urban isolation. 8 It can be called discrimination 9 because this paper sheds further light on the extent of employer discrimination by hukou type by comparing the earnings of self-employed workers with those of their counterparts who are employed by others. If employer discrimination were a principal source of discrimination against agricultural hukou people, then we would expect the agricultural/nonagricultural hukou earnings ratio to be higher for self-employed people than for people who are employed by others. 10 It has been argued that the public sector is nondiscriminatory because of the bureaucratic rules and regulations designed to ensure nondiscriminatory hiring and upgrading of public employees (Long, 1975). Alternatively, employer discrimination should be lowest in private sectors under competition because nondiscriminators can gain a competitive advantage by substituting lower cost minority workers for relatively higher cost mainstream workers, ceteris paribus (Arrow, 1973; Becker, 1957). 11 However, if higher discrimination exists in public employment, it is more likely to be institutionally designed, whereas higher discrimination in private employment may instead reflect low competition. There is also no theoretical foundation for discrimination in agriculture-related versus nonagriculture-related jobs. However, in China, nonagricultural sectors are more directed by the government than agricultural sectors; thus, if higher discrimination exists in nonagricultural professions than in agricultural professions, it indicates institutional fault lines.

8 The earning difference by identity type in China is interesting because of its distinctive path dependency on state socialism. Socialist institutions such as the hukou system influence the welfare regime in distinctive ways, and the tools of the market economy are constrained by these institutions (London, 2014). Chinese cities are experiencing labor shortages and recruiting illegal non-Chinese workers (Chan, 2010) while hundreds of millions of people stay in rural areas and earn very low incomes. Evidence also shows that there is surplus low-wage labor in the rural areas (Cai, 2007; Fields and Song, 2013; Knight et al., 2011; X. Zhang et al., 2011). This situation raises the simple question of why these rural workers choose not to migrate for higher paying jobs and instead work in the agricultural sector for considerably lower earnings. The reason is that migration is costly and that they face discrimination in the labor market due to their hukou status.
9 Labor market discrimination is defined as the valuation in the marketplace of personal characteristics of the worker that are unrelated to the worker's productivity (Arrow, 1973). There are three main forms of labor market discrimination: wage discrimination, hiring discrimination, and premarket discrimination. Wage discrimination means that two groups of workers with equal productivity receive different wages (Becker, 1957), and hiring discrimination means that they receive different access to the same jobs (Bertrand and Mullainathan, 2004). Premarket discrimination means that comparable workers receive different levels of education and training, which are likely to influence their future labor market outcomes (Neal and Johnson, 1996).
10 See the work of Moore (1983) for further details. In the literature, people who are employed by others are called salary and wage workers. I used "employed by others" to be consistent with the language of the questionnaire.
The next section presents the background of the hukou system transformation. Section 3 analyzes the problems with the parametric methods, proposes an alternative method, and describes data. Section 4 covers the empirical results, and section 5 concludes the paper.
2 Background of the hukou system and reforms
The modern hukou system, which was implemented in the late 1950s, is a modified version of the household registration system that has been a part of China for thousands of years. The purposes of the original and the current system are the same: restricting population movement, removing troublemakers, and, more broadly, maintaining social peace and order. 12 In the post-reform period, cheap labor has been a key factor in China's recent economic success, and hukou is the instrument that helps the government maintain cheap labor for factories that are mostly owned by the government (Chan, 2009, 2010; Mackenzie, 2002; K. H. Zhang and Song, 2001). Local government officials believe that agricultural hukou people are the source of the cheap labor they need to grow their economies so that they can be promoted to higher positions. Thus, they oppress migrant workers with lower wages and limited bargaining power.
Another incentive for maintaining the hukou system is fiscal: if more people settle in urban areas and governments have to provide urban benefits such as education, health, housing, police, and others to all of them, the cost will be high because agricultural hukou people usually have a limited tax base. This is why urban local governments accept only outstanding migrants as permanent residents.
To take advantage of the cheap labor, the Chinese Communist Party (CCP) began to relax the hukou system in 1978, when China first embraced market-economy reforms. 13 Since then, China has experienced a dramatic change in the labor market. Workers, for the first time, could migrate to other cities or regions for employment purposes. In recent years, through a gradual reform process, the hukou system has been significantly relaxed by the local government authorities, to whom responsibility for managing the hukou system was devolved in the early 1980s (Cai, 2011; Chan, 2009; Song, 2014). In many cases, however, the cumulative effect of these reforms is not the abolition of the hukou system; rather, they make permanent migration of rural farmers to cities harder (Chan and Buckingham, 2008). The hukou system remains essentially potent and intact (Chan and Buckingham, 2008), and the CCP can still intervene in the labor market through the hukou system in a way that favors nonagricultural hukou over agricultural hukou inheritors.

11 Discriminatory firms should not be able to compete in the market in the long run. On the other hand, competitive pressure and cost consciousness are generally less intense in public organizations, which implies that a larger earning differential should be expected in public employment (Long, 1975).
12 Many others say that the primary logic of the hukou system was based on a division of labor. Agricultural citizens are given access to arable land, whereas nonagricultural citizens rely on the state's allocation of resources and jobs and are expected to industrialize the nation (Chen and Fan, 2016). Since the reforms of the 1980s, which allowed people to migrate, the "agricultural" and "nonagricultural" hukou distinction no longer bears any necessary relationship to this division of labor (Cheng and Selden, 1994). Hundreds of millions of Chinese work in urban industry while maintaining their "agricultural" hukou status (Fields and Song, 2013; Song, 2014).
13 The hukou system reforms can be divided into two broad phases: phase I and phase II (1979-present). In phase I, agricultural hukou holders were confined to agricultural work and were not allowed to go to cities to work (Chan, 2012). During this time, migration, including illegal immigration, was limited (H. X. Wu, 1994). The legal migration process was complex. During this prereform phase, the annual change rate was kept at a very low level of 0.15% to 0.20% (Y. L. Lu, 2002). In phase II, which started in the early 1980s, China implemented many policies that transferred the responsibility of managing the hukou system to local governments. Local governments now have full authority to set their own local hukou admission criteria (Song, 2014).
Different city governments initiated different point-based hukou conversion systems. Big cities such as Beijing, Shanghai, Guangzhou, Shenzhen, Chengdu, and others make it tough by grading an application according to a point system based on an applicant's education level, tax payments, birth control, and work experience (Sheehan, 2017). In addition to education, higher tax return, and work experience, some local governments require an urban capacity fee, investment, and home purchase. Some cities introduced blue-stamped hukou, a special type of hukou different from regular red-stamped hukou, targeting migrants who can afford to purchase a house in their cities and whose talents are urgently needed for the city (L. Wu, 2013). Blue-stamped hukou can be converted to regular nonagricultural hukou in 3-5 years depending on city policies. There were also many hukou selling programs led by local urban governments to increase their monetary returns (Chen and Fan, 2016). By 1993, more than 3 million nonagricultural hukou were sold in exchange for 25 billion yuan (Han, 1994). The price of the hukou type and location vary. For example, 10,000 and 15,000 yuan used to be charged to nonagricultural hukou (transfer) and agricultural hukou (conversion), respectively, to purchase local hukou in Xiamen in 1993 (Chen and Fan, 2016). On the other hand, some cities, usually less attractive, offered monetary incentives to attract people. For example, according to Cixi's policy in 2008, if local agricultural hukou people convert to nonagricultural hukou status and give up their contract lands, they could obtain 24,300 yuan so that they can fully cover their pension insurance premium and subsidies for social insurance, loans, and urban housing. A 4,000 yuan reward was also available for doing these voluntarily in the province of Zhejiang (Chen and Fan, 2016). 14 There are many other examples of hukou reforms in China. 
Thus, the hukou conversion and transfers range from very difficult (Beijing, Shanghai, etc.) to less difficult (Sichuan, except in the city of Chengdu) to even monetary rewards (cities of Zhejiang), depending on the attractiveness of the cities and the purpose of the city governments. The burdensome process of applying for urban hukou and the low likelihood of finding affordable housing in a first-tier city have discouraged many low-income migrants from taking advantage of these reforms.
These reforms helped only well-educated elites to obtain nonagricultural hukou in big cities.
Those who have been able to convert their hukou have experienced a large increase in subjective well-being (Tani, 2017). However, evidence shows that, although there is an economic cost associated with holding an agricultural hukou, there is no subjective well-being cost (Asadullah et al., 2018).
14 Here the government's main incentive was land acquisition.
3 Empirical strategy and data

Problem with the existing parametric empirical strategy
The simplest way to find the earning difference by hukou types is by estimating equation (i). Earlier literature on discrimination, and specifically on hukou-based discrimination, used the decomposition approaches developed by Oaxaca (1973) and Blinder (1973). 15 However, calling the entire unexplained portion discrimination may not be appropriate because of a limitation of the OB decomposition approaches: misspecification due to differences between the hukou groups in the support of the empirical distributions of individual characteristics. While this problem is largely recognized in the program impact evaluation literature, it has not received much attention in the analysis of wage discrimination (Heckman et al., 1997; Ñopo, 2008). Dolton and Makepeace (1987) and Munro (1988) identified another limitation of the original OB model: the wage gap decomposition is informative only about the average unexplained earning difference, not about its distribution. To address this distributional limitation, Buchinsky (1994) proposed the estimation of quantile earning equations, and Jenkins (1994) and Hansen and Wahlberg (1999) proposed using generalized Lorenz curves for both observed earnings and predicted counterfactual earnings. However, these proposals still ignore the problem of differences in common support between the hukou groups. There are combinations of characteristics for which one can find people of one hukou status in the labor force but not of the other. Moreover, people with agricultural hukou may concentrate in occupations that require access to certain resources, such as land for agricultural professions, while nonagricultural hukou people are more likely to be in managerial occupations that require long tenure. Thus, it is not accurate to compare earnings across hukou status. 16 The OB decomposition fails to recognize these differences in support, and in the distribution of explained earning differences, because it estimates earning equations for all working Chinese without restricting the comparison to individuals with comparable characteristics. Without this restriction, the OB decomposition implicitly rests on an "out-of-support assumption": it must assume that the linear estimators of the earning equations remain valid outside the support of the individual characteristics for which they are estimated. This assumption tends to overestimate the component of the gap attributable to differences in rewards (Ñopo, 2008).

15 The wage differentials can be analyzed using equation (ii):
$$\bar{W}^U - \bar{W}^R = (\bar{X}^U - \bar{X}^R)\hat{\beta}^U + \bar{X}^R(\hat{\beta}^U - \hat{\beta}^R) \qquad \text{(ii)}$$
where the R and U superscripts refer to agricultural and nonagricultural hukou people, respectively, $W$ stands for wage/income, $X$ for individual characteristics, and $\hat{\beta}$ are the linear estimates of the parameters $\beta$ from the wage equation. The decomposition generates the counterfactual $\bar{X}^R\hat{\beta}^U$: what would an agricultural hukou inheritor earn if the compensation scheme for his/her individual characteristics aligned with that of a nonagricultural hukou inheritor? Based on this counterfactual, the OB decomposition separates the wage differential between agricultural and nonagricultural hukou holders into two components. $(\bar{X}^U - \bar{X}^R)\hat{\beta}^U$ is the explained portion of the earning gap, which can be attributed to differences in the mean observable characteristics. $\bar{X}^R(\hat{\beta}^U - \hat{\beta}^R)$ is the unexplained portion of the earning gap, which is often attributed to discrimination or to different returns on productivity characteristics (L. Lee, 2012; Meng, 1998; Ñopo, 2008).
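To make the decomposition concrete, the following minimal Python sketch computes a two-fold Oaxaca-Blinder split of a mean log-wage gap with a single covariate (years of schooling). The numbers, the variable names (`school_u`, `logw_r`, and so on), and the helpers `ols_fit` and `ob_decompose` are illustrative assumptions, not taken from the paper or its dataset. The nonagricultural group's coefficients serve as the reference wage structure, matching the counterfactual described in the text.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor least-squares fit."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def ob_decompose(x_u, w_u, x_r, w_r):
    """Two-fold OB decomposition using group U's coefficients as reference."""
    a_u, b_u = ols_fit(x_u, w_u)
    a_r, b_r = ols_fit(x_r, w_r)
    raw_gap = mean(w_u) - mean(w_r)
    # Explained part: difference in mean characteristics priced at U's returns.
    explained = (mean(x_u) - mean(x_r)) * b_u
    # Unexplained part: difference in intercepts and returns, evaluated at R's means.
    unexplained = (a_u - a_r) + mean(x_r) * (b_u - b_r)
    return raw_gap, explained, unexplained

# Invented data: years of schooling and log wages for the two hukou groups.
school_u = [9, 12, 12, 15, 16]
logw_u = [7.9, 8.2, 8.3, 8.6, 8.8]
school_r = [6, 9, 9, 12, 12]
logw_r = [7.4, 7.6, 7.7, 7.9, 8.0]

gap, explained, unexplained = ob_decompose(school_u, logw_u, school_r, logw_r)
print(f"gap={gap:.3f} explained={explained:.3f} unexplained={unexplained:.3f}")
```

By construction the two parts sum to the raw gap. The split nevertheless relies on each fitted line being valid wherever the other group's characteristics fall, which is the out-of-support concern raised above.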
Alternatively, we could treat hukou adoption as endogenous and use an instrumental variable (IV) estimator. However, the dataset does not contain a good instrument. Moreover, IV estimation procedures arbitrarily impose a linear functional form, under which the coefficients on the control variables are restricted to be the same for both hukou types (Heckman and Navarro-Lozano, 2004; Jalan and Ravallion, 2003). Hence, I do not pursue this approach.
Another parametric solution, Heckman's selection correction model, allows a full set of interaction effects but comes at the cost of imposing strong distributional assumptions, such as that the unobserved determinants of wage/income and hukou adoption are jointly normally distributed with zero mean, constant variance, and a covariance term (Main and Reilly, 1993). Therefore, in the next section, I turn to an alternative nonparametric matching strategy that removes these restrictive assumptions.

PSM estimator
PSM can overcome the selection bias problem and provide valid estimates of average treatment effects (ATE) as well as average treatment effects for the treated (ATT) with fewer assumptions (Guo and Fraser, 2015; Ho et al., 2007). 17 Although matching techniques concerned with the comparison of groups with similar characteristics have been of special interest to experimental design and statistics, they have also been widely used for nonexperimental designs (Ñopo, 2008). Rosenbaum and Rubin (1983) defined the ATE ($\Delta$) in a counterfactual framework as
$$\Delta = Y_i^R - Y_i^U \qquad \text{(iii)}$$
where $Y_i^R$ and $Y_i^U$ denote the earnings of individual $i$ with rural and urban hukou, respectively. In estimating the impact in equation (iii), a problem arises because either $Y_i^R$ or $Y_i^U$ is observed for each individual, but never both. What is generally observed can be written as
$$Y_i = D_i Y_i^R + (1 - D_i) Y_i^U \qquad \text{(iv)}$$
where $D_i$ equals 1 if individual $i$ holds an agricultural hukou and 0 otherwise. Accordingly, we can rewrite the expression for $\Delta$ as follows:
$$\Delta = P \cdot [E(Y^R \mid D = 1) - E(Y^U \mid D = 1)] + (1 - P) \cdot [E(Y^R \mid D = 0) - E(Y^U \mid D = 0)] \qquad \text{(v)}$$
where $P$ is the probability of observing an individual Chinese with $D = 1$ and $\Delta$ is the ATE.

Equation (v) says that the impact of the hukou system for the entire sample is the weighted average of the effect of hukou adoption in the two groups of individuals, those with agricultural hukou status, or the treated (the first term), and those with nonagricultural hukou status, or the controls (the second term), each weighted by its relative frequency. The main problem in interpreting discrimination as a causal effect stems from the fact that the unobserved counterfactuals, $E(Y^R \mid D = 0)$ and $E(Y^U \mid D = 1)$, cannot be estimated directly (Becerril and Abdulai, 2010; Heckman et al., 1998; Mendola, 2007; Smith and Todd, 2005). In an ideal case, experimental data would provide information on the counterfactuals and thereby solve the problem of causal inference. This is not such an ideal case: the available data do not provide information on the counterfactual situation, which creates a problem of missing data (Blundell and Dias, 2000).

16 This situation is analogous to comparing wages across gender when there are male- and female-dominated occupations, as discussed by Deutsch et al. (2005) in relation to occupational segregation by gender in Latin America during the 1990s.
17 Rosenbaum and Rubin (1983) defined the propensity score as the conditional probability of assignment to a treatment given a vector of observed covariates. PSM does not require any estimation of earning equations and hence needs no validity-out-of-the-support assumption.
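The weighted-average identity behind equation (v) can be checked numerically. The short Python sketch below uses an invented five-person population in which both potential earnings are listed for everyone, something real data never provide; the records and the names `people` and `mean_effect` are assumptions for illustration only.

```python
from statistics import mean

# Each record is (D, y_r, y_u): D = 1 marks an agricultural hukou holder,
# y_r and y_u are hypothetical log earnings under rural and urban hukou.
# Listing both outcomes is possible only because the data are invented.
people = [
    (1, 7.6, 7.9),
    (1, 7.8, 8.0),
    (1, 7.5, 7.9),
    (0, 8.1, 8.3),
    (0, 8.4, 8.5),
]

def mean_effect(records):
    """Average of y_r - y_u over the given records."""
    return mean(y_r - y_u for _, y_r, y_u in records)

treated = [p for p in people if p[0] == 1]
controls = [p for p in people if p[0] == 0]
p_treated = len(treated) / len(people)  # the share P with D = 1

ate_direct = mean_effect(people)
ate_weighted = (p_treated * mean_effect(treated)
                + (1 - p_treated) * mean_effect(controls))
print(ate_direct, ate_weighted)
```

The two figures agree up to floating-point rounding. In observed data the rural outcome of controls and the urban outcome of the treated are missing, so neither bracketed term can be computed without further assumptions.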
This missing data situation requires estimating the direct consequence of the hukou registration system from the variation of income across individuals using statistical matching techniques. The current paper attempts to address this central problem of missing information on the counterfactuals by using the PSM method, which summarizes the characteristics of each individual into a single index variable that predicts hukou registration status. It then uses this propensity score to match similar individuals (Rosenbaum and Rubin, 1983) in order to isolate the hukou registration effect from other socioeconomic determinants of individual-level income.
The propensity score can be expressed as
$$p(X) = \Pr(D = 1 \mid X) = E(D \mid X) = F\{h(X)\} \qquad \text{(vi)}$$
Here, $F\{\cdot\}$ can be the logistic cumulative distribution, $h(X)$ is a function of the covariates, and $X$ is a vector of characteristics that determine hukou status. There are several assumptions for estimating the hukou effect on earnings using PSM. The first is the conditional independence assumption, which states that hukou status inheritance is random and uncorrelated with income once we control for X (Mendola, 2007). The second assumption is that the ATE is defined only within the region of common support. This implies that the propensity score must lie strictly between 0 and 1; although enforcing this reduces the sample size, the quality of matches improves significantly when the tails of the distribution of p(X) are excluded. Third, I assume that selection into urban hukou is based on observables, because the administrative offices grant urban hukou to rural Chinese people on the basis of qualifications such as education and income, which are observable individual characteristics. Officers do not have an opportunity to consider any unobservable characteristics that may favor someone in obtaining an urban hukou. These assumptions ensure that people with the same X values have a positive probability of being in either hukou group (Heckman et al., 1997). [...] (Smith and Todd, 2005). Caliper matching helps avoid bad matches, which is crucial to the successful application of the PSM approach.
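The matching steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's estimation: the data are simulated with a single covariate and a known effect of -0.2 log points, the logit is fitted with a hand-written gradient ascent (`fit_logit`) rather than a statistics package, and the caliper of 0.05 is an arbitrary choice. All function and variable names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(x, d, steps=500, lr=0.5):
    """Fit Pr(D=1|x) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, di in zip(x, d):
            resid = di - sigmoid(b0 + b1 * xi)
            g0 += resid
            g1 += resid * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def caliper_match_ate(score, d, y, caliper):
    """Nearest-neighbour matching with replacement on the propensity score.

    Every unit is matched to the closest unit of the opposite group; units
    with no such neighbour within the caliper are dropped (outside support).
    """
    effects = []
    for i in range(len(score)):
        best, best_gap = None, caliper
        for j in range(len(score)):
            gap = abs(score[j] - score[i])
            if d[j] != d[i] and gap <= best_gap:
                best, best_gap = j, gap
        if best is None:
            continue
        y_treated, y_control = (y[i], y[best]) if d[i] == 1 else (y[best], y[i])
        effects.append(y_treated - y_control)
    return sum(effects) / len(effects), len(effects)

# Simulated data (assumption): x is a standardized ability/schooling index,
# lower x makes agricultural hukou (D = 1) more likely, true effect is -0.2.
random.seed(0)
n = 400
x = [random.gauss(0.0, 1.0) for _ in range(n)]
d = [1 if random.random() < sigmoid(-xi) else 0 for xi in x]
y = [8.0 + 0.3 * xi - 0.2 * di + random.gauss(0.0, 0.1) for xi, di in zip(x, d)]

b0, b1 = fit_logit(x, d)
score = [sigmoid(b0 + b1 * xi) for xi in x]
ate, n_matched = caliper_match_ate(score, d, y, caliper=0.05)
print(round(ate, 3), n_matched)
```

A tighter caliper discards more units but keeps matched pairs closer in propensity score, which is the trade-off between sample size and match quality noted in the common support discussion.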

Data and descriptive analysis
Whether a result is generalizable and policy relevant depends on whether the sample data and their sampling procedures are representative (Brunswik, 1956; Cappelen et al., 2015; Hogarth, 2005; Kruskal and Mosteller, 1979). This is especially relevant for research in China, both because data can be manipulated to support authoritarian regimes and because of the country's large geographic extent and its economic and regional fragmentation (Jian et al., 1996; Poncet, 2005; Wallace, 2016). This paper uses the 2014 survey wave of the CFPS, which is the largest and most comprehensive social survey in China (Xie and Hu, 2014, p. 26) and has not been used in any earlier studies of hukou-based earning discrimination. 21 The survey obtained its nationally representative [...] the remaining 20 provinces, one independent sampling frame was used (Xie and Hu, 2014).
18 Caliendo and Kopeinig (2008) suggested that the basic idea is to compare the situation before and after matching and then check if there are any remaining differences after conditioning on the propensity score. After matching, there should be no systematic differences in the distribution of covariates. However, there can still be some "hidden bias" if there are unobserved variables that simultaneously determine the adoption of hukou and earnings. Thus, the use of a sensitivity analysis is necessary to address the issue (Rosenbaum, 2002).
19 Guo and Fraser (2015, pp. 48-50) emphasized the importance of estimating appropriate treatment effects using methods suited to the research question. In this paper, the estimation of ATE is more appropriate because I am interested in the effect of hukou status at the population level, where every Chinese individual holds either an agricultural or a nonagricultural hukou. This research question is fundamentally different from medical research, where ATT is more common because a specific medical intervention is given to only a few people. Moreover, ATE is more efficient when there is sufficient overlap (common support) in the distribution of the estimated propensity scores (Hirano et al., 2003; Rubin and Thomas, 1992).
20 NNM can be applied with or without replacement. Matching with replacement involves a trade-off between bias and variance. If we allow replacement, the average quality of matching will increase, and the bias will decrease. Caliper/radius matching works in the same direction as allowing for replacement: bad matches are avoided and hence the matching quality increases.
Wide-ranging information was collected through computer-assisted person-to-person interviews of all family members, drawing on the methods and experiences of the most influential survey projects in the world (Xie and Hu, 2014). 22 Map 1 displays the sample distribution across Chinese provinces. The background color codes represent the Chinese population per square kilometer (blue, yellow, and red show low, medium, and high population density, respectively), and every black triangle represents 10 sample individuals. It clearly shows that more sample individuals were interviewed in densely populated areas. The eastern part of China is highly populated, and therefore more interviews took place there.

Although there is an equal number of males and females in the agricultural and nonagricultural hukou groups, their other demographic characteristics differ. Nonagricultural hukou people are older and have poorer self-reported health status, but they have a lower BMI. For most categories of ability measures, nonagricultural hukou people score higher than agricultural hukou people. Nonagricultural hukou adopters have a higher level of education and higher mathematical and language abilities. However, agricultural hukou people have a higher memorizing ability. While the word and math tests score respondents' cognitive ability based on cognitive tests, the memory test scores an individual's ability to remember important things that happened to him/her within a week (Xie and Hu, 2014). There is no significant difference by hukou type in the number of people currently attending school.
More nonagricultural hukou people are in migrant status than agricultural hukou people, and the difference is 28%, which is substantial. In the ambition group of variables, networking does not differ by hukou group, but agricultural hukou people reported more confidence about their future life than nonagricultural hukou people. In terms of attitude, the ideal number of children is larger for agricultural hukou people, but there is no significant difference between the two hukou groups in whether they consider people helpful or selfish and trustworthy, or in whether they trust their neighbors.
In terms of family background, a higher percentage of agricultural hukou people are in a marital relationship and they reported that they have a higher social status in their locality.
While agricultural hukou people spend more time doing household work, nonagricultural hukou people spend more time watching television. In terms of socioeconomic background, CCP membership is higher among nonagricultural hukou holders, and agricultural hukou holders reported that they face more unfair treatment by the government due to their low level of personal wealth. In terms of employment, agricultural hukou holders are more likely to be employed (by 20%) and to work in agricultural sectors (by 45%) than nonagricultural hukou holders. However, nonagricultural hukou people are more likely to be retired (by 23%).

Empirical result analysis
This section presents the empirical results on the discriminatory consequences of the hukou system for the earnings of Chinese citizens. In a counterfactual framework, the question would be as follows: if we picked an individual at random from our sample and, going back in history, changed his/her hukou status, how would his/her current earnings change? I also investigated the impact of hukou within work ownership, employer types, work types, and labor contract conditions. Because I was interested in the causal effect, I selected PSM as the most appropriate estimation technique; however, I started with OLS multivariate estimation as a baseline and then moved to PSM estimation. Before we move to the results, we should understand the dependent variable. We observe wages as well as income, where income is the sum of wage earnings and capital earnings. Note that, contrary to the standard wage/income function, which is often in logarithmic form, we estimate both the parametric and nonparametric models on raw wages/incomes. This is because the log-normal wage equation suffers from at least three problems: (i) the bias created by the logarithmic transformation, (ii) the failure of the assumption that all error terms have equal variance (homoskedasticity), and (iii) the sensitivity of research results to zero-valued wage/income (Burger et al., 2009; Flowerdew and Aitkin, 1982). Table 2 reports a straightforward comparison of annual wage and income by hukou status. It also separates the positive earners from the zero earners. The average gross annual wage and income of agricultural hukou holders are lower by 6,251 and 6,914 yuan, respectively. These amounts are almost half of the wage and income of nonagricultural hukou holders. The difference is larger in incomes than in wages. The wage gap increases when only positive wages are compared after excluding zero earners, but the income gap declines when only positive income earners are compared.
This indicates that agricultural hukou holders earn more from nonwage sources than nonagricultural hukou holders. Therefore, hukou-based earning discrimination exists in Chinese society.
Notes: The dependent variable is wage. Standard errors are robust (in parentheses). *p < 0.1, **p < 0.05, and ***p < 0.01.

Wage analysis
Looking at the details of the results, regressing wage on hukou registration in column (3.1) while controlling for demographic characteristics shows that agricultural hukou holders earn 5,438 yuan less than nonagricultural hukou holders. This amount is about 800 yuan less than the 6,251 yuan bivariate estimate shown in Table 2. All the control variables have the expected direction of effects, and they are statistically significant in column (3.1). Chinese men earn more than Chinese women by a large margin, which is consistent with recent wage estimates in China (Asadullah and Xiao, 2019). Older people earn more, but this result is not robust to adding more controls to the model. However, we find a significant and inverse U-shaped relationship between age and earnings. 23 Higher BMI reduces wages for Chinese people, but the effect is not consistently statistically significant. Health matters for wage earnings (Schultz, 2002). We find that, based on self-reported health status, people with poorer health earn significantly less, which is also consistent with other Chinese estimates (Chuanchuan, 2011).
All the model specifications in Table 3 control for Chinese provinces and major religions.
There are observations from 25 provinces in China, and controlling for them should account for regional disparities, including living standards, cost of living, regional economic development, and so on. This is essential because regional inequality in China is among the highest in the world (Fleisher et al., 2010; Zhang and Kanbur, 2005). The reference province is Beijing; people in most provinces, such as Shandong, Hunan, and Gansu, earn significantly less than people in Beijing. On the other hand, in provinces such as Shanghai, Zhejiang, and Inner Mongolia, people's earnings are not significantly different from the earnings of people in Beijing (note that the coefficients of regional dummies are not reported here). For the religion dummy, there are seven different religious groups: Buddha, Taoist deity, Muslim, Catholic, Jesus Christ, Ancestor, and others without religious beliefs. Buddha is the reference religion; the results show that there are no significant earning differences based on religious identity, except that Muslim people earn significantly less than Buddhist people. This is consistent with earlier evidence that Muslims face systematic discrimination in their social, economic, and political rights from Han Chinese, who are predominantly non-Muslims (Chuah, 2004; J. N. Smith, 2002).
An individual's ability can potentially determine both his/her income and hukou status. Abilities, often measured by schooling and different forms of ability tests, are widely shown to be important determinants of earnings (J. Angrist, 1998). Ability also matters for hukou status, since China's most recent attempt to liberalize hukou has allowed rural hukou holders to apply for first-tier urban hukou based on a points system that considers educational attainment among other factors (A. Chen, 2018). Column (3.2) adds the most important control variables that measure ability, such as years of schooling, training, cognitive skills, and memorizing capacity.
These additional controls reduce the coefficient of hukou by approximately 2,200 yuan, but hukou status remains highly statistically significant. People with more years of schooling and more training earn more on average, which is also well documented in the literature (J. Angrist, 1998).
Note that nondegree training is rewarded relatively more than school education, since the coefficient of training is almost five times larger than the coefficient of years of schooling. This is because training is often directly related to the skills required for job performance, whereas education is often very general and may not directly help to perform jobs well (Eck, 1993). Moreover, controlling separately for language and mathematical ability, which are part of the education in school, may underestimate the true returns to education (Asadullah and Xiao, 2019, p. 90). A person currently attending school may also earn significantly more than a person not attending school. This is because, when a person is employed and simultaneously enrolled in school, the educational program is likely to be related to job requirements. People with higher mathematical ability earn more; however, people with better language and memorizing ability earn relatively less. The word and memorizing abilities do not have the expected direction of effects. The reasons might be that people with better language ability studied literature that did not help them earn much or that the test did not capture true language ability. Similarly, the memorizing ability measure was self-reported; thus, regional culture may have affected the self-reported statements, as we see in Table 1 that agricultural hukou inheritors reported higher memorizing ability than nonagricultural hukou inheritors. 24
Column (3.3) includes controls for ambition and attitude-related variables; however, these additional controls increase the coefficient of hukou only slightly. Measuring the actual ambition level of an individual is difficult. However, our dataset contains some variables that directly and indirectly represent how ambitious an individual is, based on the questions: how confident are you in your future, and how good are you at maintaining interpersonal relations or networking? Controlling for these variables allowed me to capture many personality characteristics that should matter for earning differences as well as for the decision to change hukou status.
23 Note that we do not have a good measure of experience in the CFPS 2014. That is why we control for age and age squared, which is a close substitute for experience. The result is consistent with most recent literature (Asadullah and Xiao, 2019;Smyth, 2013, 2015).
The results show that being good at networking or maintaining interpersonal relationships helps Chinese individuals earn higher wages, but a higher level of confidence in future life affects wages negatively. Theoretically, it makes sense that higher networking ability should be positively associated with earnings (Pietro, 2007), but it does not make sense that higher confidence in the future is negatively associated with wages. 25 Instead, the effect may be dominated by people who are temporarily in unexpected circumstances and currently earn low wages but are confident that they will recover in the future. Attitude is another important part of one's personality. Evidence in the literature about the impact of an individual's personality traits or attitudes on earnings is heterogeneous, depending on the dimensions of attitudes (Heineck, 2011; Heineck and Anger, 2010; Semykina and Linz, 2007).
[...] individual's earning and other achievements (Datcher, 1982; Ermisch and Francesconi, 2001; Loury, 1977; Meghir and Palme, 2005). I added controls for differences in individual family and socioeconomic background, represented by nine different variables covering various aspects of that background, in column (3.4). Most of them are statistically significant predictors of earnings. These controls increase the coefficient of hukou by a large margin of more than 1,500 yuan.
24 We did not control for occupation as it may disturb the coefficient of hukou if the occupation is an intermediary variable between hukou and earning (Angrist and Pischke, 2008). In other words, if an individual engages in agricultural farming because his/her hukou status requires it, I should not control for occupation in the regression analysis, as it may be a bad control.
25 Empirical evidence such as Filippin and Paccagnella (2012) has shown that a small difference in self-confidence can lead to gaps in human capital and economic outcomes.
Marriage carries a wage premium: empirical evidence shows that married workers tend to be in higher paying job grades and receive higher performance ratings than single men; thus, married men are more likely to be promoted (Korenman and Neumark, 1991). On the other hand, divorce and separation have adverse economic consequences, particularly for women (Duncan and Hoffman, 1985). Therefore, I controlled for marital status, and the results show that being in a marital relationship accounts for a substantial wage difference of more than 1,100 yuan, which is approximately 17% of the average wage gap between the two hukou groups.
On the other hand, spending more time on household work and watching TV is associated with lower earnings. This is in line with the pervasiveness of income-leisure trade-offs in the literature (Battalio et al., 1981; Bielby and Bielby, 1988).
CCP members earn more than non-CCP members. Becoming a CCP member can bring a lot of advantages in the Chinese community and thereby should influence an individual's wage earnings. 26 The dataset allows me to control for this important characteristic of Chinese individuals, and it shows that CCP membership accounts for a substantial amount of earning differences, more than 1,300 yuan. Classical economists well understood that individuals are motivated at least partly by concerns about relative position. A large volume of empirical research demonstrates the relationship between relative position and well-being (Frey and Stutzer, 2002). 27 We control for the relative social status of family and individual him/herself.
While the higher relative social status of an individual is positively associated with wages, the higher relative status of a family does not influence wages. Whether a person reported that he/she faced unfair treatment due to inequality of personal wealth, or experienced any unfair treatment from the Chinese government, did not matter for differences in wage earnings. The dataset also allows me to control for who is in retirement. Retired people should earn a pension, not wages, but some people took another job after retiring from their main profession. Results show that people who are retired earn less than those who are not retired.
Migrant Chinese earn more than local Chinese residents (column 3.5). Note that these migrants can be any of four possible types: rural-urban, urban-urban, urban-rural, and rural-rural. The dataset shows that only 17% of people are in migrant status: 7% carry a rural hukou, while the other 10% carry an urban hukou. It is expected that rural-urban migrants are vulnerable and earn less than local residents, which much of the earlier literature confirmed (Frijters et al., 2009; Gagnon et al., 2012; Lee, 2012). However, we cannot confirm this hypothesis from these findings, as more migrants in this analysis hold an urban hukou and do not face considerable discrimination while living in a different city. Migrants who carry a rural hukou can be in either another rural area or an urban area. I included an interaction of migration status with hukou in column (3.5), which captures the case in which an agricultural hukou holder is a migrant, and the coefficient of the interaction term is positive. This means that migrants earn more on average than local residents. This is interesting because, if migrants earn more, why do more people not migrate? A possible answer is that migration in China is associated with many explicit and implicit costs (see Zhao, 1999, p. 777).
26 Bian et al. (2001) found that party membership is positively associated with mobility into positions of political and managerial authority during the post-1978 reform era. Walder (1995) further explained that when a member of the CCP has both educational and political credentials, he/she could have administrative posts with high prestige, considerable authority, and clear material privileges.
27 Luttmer (2005) found that, controlling for an individual's own income, higher earnings of neighbors are associated with lower levels of self-reported happiness. McBride (2001), Ferrer-i-Carbonell (2005), and Blanchflower and Oswald (2004) found tantalizing evidence that relative income affects subjective well-being.

Income analysis
Another indicator of living standard is income, which can be either wage or nonwage income such as capital earnings. Table 4 reports the OLS estimates of the effect of hukou registration on income. I used the same specifications here as in Table 3. The overall results from Table 4 show that agricultural hukou holders earn significantly lower incomes than nonagricultural hukou holders. This is equivalent to what we have seen in Table 3. This is certainly good news for the regression result because this estimation has a reasonable set of controls to potentially avoid omitted variable bias. Econometricians tend to treat such a strong association as causal when adding and dropping some variables does not change the conclusion for the coefficient of interest (J. D. Angrist and Pischke, 2008). One may be concerned about the low adjusted R-squared values of 0.23 for wage and 0.24 for income; note that other related literature that studied the impact of hukou also found low R-squared values ranging approximately from 0.15 to 0.25 (Q. Deng, 2007; Liu, 2005; Song, 2016). Note that the log-normal equation has an improved R-squared with a similar conclusion; however, we did not report that result because we believe that the raw-earnings estimates are less biased, as discussed earlier in this paper. We cannot fully rely on the OLS estimation because of selection biases, as explained before. Thus, I move to PSM nonparametric estimation in the next section to estimate the causal effect.

Wage and income analyses for all samples
We need to specify the propensity scores of adopting a hukou before estimating the impact of hukou registration on earnings nonparametrically. I use a logit model, which is the most preferred model (Rosenbaum, 1986;H. L. Smith, 1997), to predict the probability to adopt a hukou type and include different individual-, family-, and society-level characteristics as regressors.
Notes: The dependent variable is income (wage + non-wage earnings). Standard errors are robust (in parentheses). *p < 0.1, **p < 0.05, and ***p < 0.01. OLS, ordinary least square; CCP, Chinese Communist Party.
There is differing advice in the literature regarding the inclusion of control variables in the PSM. While Rubin and Thomas (1996) advised against "trimming" models in the name of parsimony, Sianesi (2004) and Smith and Todd (2005) recommended that the selection of covariates should be grounded on the theory that relates covariates to outcomes and treatment. Note that omitted variable bias produces inaccurate propensity scores (Baser, 2006, p. 379). Inclusion of variables that are weakly related to treatment (hukou adoption) usually reduces bias more than it increases variance when using matching, so under most conditions, these variables should be included (Heckman et al., 1997; Rubin and Thomas, 2000). Mendola (2007) deployed the largest set of variables to estimate the effects of technology adoption on household well-being, making it less likely that unobservable characteristics remain outside the matching process. Both perspectives have merits, although there is more support for including more variables even when they are only weakly related.
In the logit model, I aimed at making agricultural and nonagricultural hukou inheritors more comparable based on scores that are built on several criteria. I took both of the above perspectives carefully into consideration by selecting as many covariates as are available in the dataset that have a theoretical connection to wage/income, hukou adoption, or both. In Table 1 in section 3.3, we saw that most variables differ between agricultural and nonagricultural hukou people, suggesting the absolute or relative subsistence or stoppage pressure of adopting nonagricultural hukou (or retaining nonagricultural hukou). Note that I only considered those demographic, ability, family, socioeconomic, and other variables that have a theoretical connection to hukou status and earnings, and I saw that most of them are significantly related to earnings in section 4.1. I applied the same specification to the logit model, except that I dropped age squared (specification 1) and added the employment variable when estimating the impact of hukou adoption on income, for the reason explained in section 4.2 (specification 2). There is no settled rule for choosing the best model for propensity score estimation. However, Rosenbaum and Rubin (1984) recommended an iterative approach for achieving covariate balance, and Diamond and Sekhon (2012) suggested that the model should aim at maximizing the balance of covariates without limit. So, I have a third specification that includes the sector of work, relative income in the local area, and others (specification 3). They are important variables in the matching process because they differ by hukou type. Specifications 1 and 2 are more parsimonious than specification 3, but they are useful for checking the consistency of the causal impact (Smith and Todd, 2005).
Note that I tried several other specifications by adding and dropping some variables, but I chose these specifications as they are grounded in theory and maximize the balance of covariates (Diamond and Sekhon, 2012). The other specifications did not change the conclusion.
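The propensity score step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the estimation used in the paper: the data are synthetic, the two covariates (`schooling`, `migrant`) and their coefficients are hypothetical stand-ins for the much richer covariate set, the treatment indicator `d` is assumed to mark agricultural hukou, and the logit is fitted by plain gradient ascent rather than by a statistical package.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logit(X, d, lr=0.1, epochs=2000):
    """Fit P(d=1 | x) = sigmoid(b0 + b.x) by batch gradient ascent on the log-likelihood."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)  # beta[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, di in zip(X, d):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            err = di - p
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity_scores(X, beta):
    # Predicted probability of holding the (assumed) treatment status.
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi))) for xi in X]

# Synthetic illustration (hypothetical data, not CFPS).
random.seed(0)
X, d = [], []
for _ in range(300):
    schooling = random.gauss(0, 1)                 # standardized years of schooling
    migrant = 1.0 if random.random() < 0.2 else 0.0
    p_true = sigmoid(0.5 - 1.0 * schooling - 0.8 * migrant)
    X.append([schooling, migrant])
    d.append(1 if random.random() < p_true else 0)  # 1 = agricultural hukou (assumed coding)

beta = fit_logit(X, d)
scores = propensity_scores(X, beta)
```

The fitted `scores` are what the matching step would then use to pair individuals across the two hukou groups.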
The results of the logit estimate of propensity scores are reported in Table A1 in the appendix. Most of the variables included have the expected signs, and they are mostly statistically significant. Migration, a response to income and nonincome differentials, is probably the most important variable that influences whether individual Chinese citizens change their agricultural hukou to nonagricultural hukou (Mundlak, 1979). Migrants have already overcome parts of the psychic and nonpsychic costs of hukou conversion (Sjaastad, 1962). The result shows that migrants are less likely to keep their agricultural hukou, which is consistent with earlier studies in China (Zhao, 1999). All demographic characteristics are statistically significant except for self-reported health status. Females, younger people, and people with low BMI are more likely to adopt nonagricultural hukou. Younger people tend to migrate more to urban areas, and thus, they adopt nonagricultural hukou more often (Zhao, 1999). More females may change their status through marriage, as marriage allows them to relocate to their husband's hometown and take the husband's hukou status. Similarly, years of schooling, training, and math and word test scores, which measure different types of abilities, lead people to adopt nonagricultural hukou; memorizing ability is the exception. Schooling plays the most influential role since it provides an information advantage in job searches in different locations (Schwartz, 1973) and it reduces the psychic cost of migration (Sjaastad, 1962). This is consistent with what we learned earlier in this paper: most nonagricultural hukous are awarded based on different types of ability such as education.
While more confident people keep their agricultural hukou, people with better networks adopt nonagricultural hukou. People who trust their neighbors and consider a larger number of children as ideal tend to keep their agricultural hukou. For people who want more than one child, keeping their agricultural hukou was a big incentive because the one-child policy was implemented strictly in urban areas, while it was relaxed in rural areas (J. Zhang, 2017). That is why we see that the coefficient of the ideal number of children is relatively large and positive. Other attitudinal variables are not statistically significant. People who spend more time on household work tend to adopt agricultural hukou, but people who spend more time watching TV adopt nonagricultural hukou. Both individual and family social status are positively associated with agricultural hukou since relative status is probably more recognized in rural areas than in urban areas (Reiss, 1959). On the other hand, CCP membership leads people to adopt nonagricultural hukou since it is much easier for CCP members than for non-CCP members to convert their status (Walder, 1995). Similarly, people who are in retirement are more likely to adopt nonagricultural hukou since it is associated with a higher pension and other benefits than agricultural hukou (London, 2014). Other socioeconomic features such as employment status, number of deceased siblings, and current school enrolment are also significantly associated with hukou types.
We should be less worried about which variables should be in the model of PSM when it satisfies some properties/tests such as balance check, bias reduction, and density of propensity score before and after matching (Diamond and Sekhon, 2012). The propensity scores only serve as a device to balance the observed distribution of covariates across agricultural and nonagricultural hukou groups. Therefore, the resultant balance assesses the reliability of the PSM application (Diamond and Sekhon, 2012;W. S. Lee, 2008). The common support conditions are imposed, and the balancing property is set and satisfied in all estimations at 1% level. The matching procedure is performed in the region of common support (Leuven and Sianesi, 2018). Figure 1 shows the distribution of the propensity scores and the region of common supports for all three specifications. It clearly reveals the significance of good matching, as well as the imposition of the common support condition to avoid bad matches.
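As an illustration of what imposing common support means in practice, the sketch below flags observations whose propensity score falls inside the overlap of the two groups' score ranges. The min-max rule shown here is one common convention and is an assumption on my part; the text does not state which rule the Stata routines applied, and the scores and treatment codes are made up.

```python
def common_support(scores, treated):
    """Flag units whose propensity score lies in the overlap of both groups' ranges.

    Uses the min-max rule (assumed here): the support region runs from the larger
    of the two group minima to the smaller of the two group maxima.
    """
    t = [s for s, d in zip(scores, treated) if d == 1]
    c = [s for s, d in zip(scores, treated) if d == 0]
    lo = max(min(t), min(c))
    hi = min(max(t), max(c))
    return [lo <= s <= hi for s in scores], (lo, hi)

# Hypothetical propensity scores and treatment indicators.
scores = [0.10, 0.30, 0.55, 0.60, 0.45, 0.70, 0.90, 0.97]
treated = [0, 0, 0, 0, 1, 1, 1, 1]

on_support, (lo, hi) = common_support(scores, treated)
print(lo, hi)        # bounds of the overlap region
print(on_support)    # True for units kept in the matching sample
```

Units flagged `False` would be excluded before matching, which is how off-support observations arise.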
While good balance is required for successful matching estimation, there is no guarantee that matching improves balance. The PSM application may make balance worse, even if covariates are distributed ellipsoidally, because in a given finite sample there may be departures from an ellipsoidal distribution (Sekhon, 2011). Therefore, to warrant the validity of the PSM analysis, it is pertinent to check the balance of covariates and the distribution of the propensity score for agricultural and nonagricultural hukou before and after matching. Figure 2 reports the standardized percentage bias across selected covariates by graphs and a histogram. 28 It shows that matching has clearly improved the balance of covariates and minimized the bias. The bias of matched covariates is all within an acceptable threshold, while unmatched covariates have a significantly larger bias. Similarly, the kernel density plots in Figure 3 show that there is a large difference in the density of the two groups before matching and that the difference disappears after matching in all three specifications. The matched samples are almost indistinguishable between agricultural and nonagricultural hukou groups, and they overlap entirely. This is certainly good news for this estimation as these tests indicate a good balance of covariates between the two hukou groups. 29
I estimated the impact of hukou registration on wage and income by two different methods: NNM and caliper/radius matching. The robustness of the effect was verified using different numbers of neighbors to match (1 and 3) and different radii (calipers of 0.01 and 0.05), and I found that the effect is not sensitive in terms of statistical significance, although it changes the size of earning differences slightly. 30 The results are presented in Table 5. We also see that the earning difference in income was larger than in wage, which is consistent with the earlier OLS estimates in Tables 3 and 4. This is the average difference in earnings between a similar pair of individuals who belong to different hukou registrations, estimated as ATE. The PSM estimate for differences in wage and income is relatively conservative compared to the bivariate estimate in Table 2. While the bivariate estimate was 6,251 and 6,914 yuan in wage and income, respectively, the PSM estimate was approximately 4,000 to 4,300 yuan in wage and 4,700 yuan in income depending on the methods of matching. So, the difference is [...] Tables 3 and 4. Therefore, the OLS estimate overestimated the true effect of hukou registration for income.
Notes: Standard errors are Abadie-Imbens' robust standard errors (in parentheses). *p < 0.1, **p < 0.05, and ***p < 0.01. While specification 1 was used for wages, specification 2 was used for income in models (5.1) and (5.2). For all others, specification 3 was used. NNM, nearest neighbor matching; ATE, average treatment effects.
28 Both graphs and the histogram in Figure 2 are based on specification 1. They are identical under specifications 2 and 3.
29 For specification 1 for whole-sample analysis, the total number of observations was 27,212 (model 5.1 in Table 5). Out of these, on-support untreated observations numbered 7,785 and treated observations 4,488, while off-support untreated observations numbered 0 and treated observations 14,939. Similarly, for specification 2, the total number of observations was 27,196. Out of these, on-support untreated observations numbered 7,777 and treated observations 4,484, while off-support untreated observations numbered 0 and treated observations 14,935. In Stata, the psgraph command was used for the common support graph, and the pstest command was used for the bias graphs and histogram.
30 I used the teffects psmatch command in Stata as it has an important advantage over the psmatch2 command written by Leuven and Sianesi (2018): it takes into account that propensity scores are estimated rather than known when calculating standard errors. This allowed me to calculate the correct robust standard errors of Abadie and Imbens (2012), which offer an important correction to the standard errors of a sample mean when missing data are imputed using the "hot deck". Abadie and Imbens (2012) derived a method to estimate the standard errors of the estimator that matches on estimated treatment probabilities, and this method is implemented in teffects psmatch.
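To make the matching logic concrete, the following sketch computes an ATE by one-to-one nearest neighbor matching on the propensity score, with replacement and a caliper. It is a simplified illustration rather than a reimplementation of teffects psmatch: it does not compute Abadie-Imbens standard errors, the function name `nn_caliper_ate` is hypothetical, and the scores and wages (in thousand yuan) are invented, with `treated = 1` assumed to mark agricultural hukou.

```python
def nn_caliper_ate(scores, treated, outcome, caliper=0.05):
    """ATE by 1-nearest-neighbor matching on the propensity score, with replacement.

    Each unit is matched to the closest unit of the opposite group. Units with no
    opposite-group neighbor within the caliper are dropped as bad matches.
    Returns the mean treated-minus-untreated difference and the number of units used.
    """
    effects = []
    for i, (s_i, d_i, y_i) in enumerate(zip(scores, treated, outcome)):
        best_j, best_gap = None, None
        for j, (s_j, d_j) in enumerate(zip(scores, treated)):
            if d_j == d_i:
                continue  # only units from the opposite group can serve as matches
            gap = abs(s_i - s_j)
            if best_gap is None or gap < best_gap:
                best_j, best_gap = j, gap
        if best_j is None or best_gap > caliper:
            continue
        y_match = outcome[best_j]
        # Always record treated outcome minus untreated outcome for the pair.
        effects.append(y_i - y_match if d_i == 1 else y_match - y_i)
    if not effects:
        raise ValueError("no matches within the caliper")
    return sum(effects) / len(effects), len(effects)

# Hypothetical data: propensity scores, treatment status, annual wage (thousand yuan).
scores = [0.20, 0.22, 0.50, 0.52, 0.80, 0.95]
treated = [1, 0, 1, 0, 0, 1]
wage = [10.0, 14.0, 20.0, 25.0, 30.0, 26.0]

ate, n_used = nn_caliper_ate(scores, treated, wage, caliper=0.05)
print(ate, n_used)
```

In this toy sample the last two units have no opposite-group neighbor within 0.05, so they are dropped, which is the sense in which a caliper avoids bad matches. Tightening or widening `caliper` mirrors the robustness checks with radii of 0.01 and 0.05.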

Wage and income analyses by different groups and conditions
The CFPS data allowed me to investigate the earning disadvantages resulting from hukou registration types within work ownership, types of work, types of employers, and labor contract conditions. This further disaggregation provides many insights into the consequences of the hukou registration system for the earnings of Chinese individuals. There is adequate representation of people of each hukou type in each group of work ownership, types of work, types of employers, and labor contract conditions, as shown in Table 6. Figure 4 confirms that there is good balance between agricultural and nonagricultural hukou people across covariates after matching. The agricultural hukou group is almost indistinguishable from the nonagricultural hukou group after matching, while there was a large difference before matching for each subsample. So, by matching based on the estimated propensity score, comparability improves evidently within the different groups by hukou type, which is the same as what we see in Figure 3.
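The balance checks referred to here rest on the standardized percentage bias of each covariate. A minimal sketch of that statistic follows, assuming the usual definition (difference in group means as a percentage of the square root of the average of the two group variances); whether the figures use exactly this variant is an assumption, and the schooling values are invented for illustration.

```python
from statistics import mean, variance

def standardized_bias(x_treated, x_control):
    """Standardized percentage bias: 100 * (mean_t - mean_c) / sqrt((var_t + var_c) / 2)."""
    pooled_sd = ((variance(x_treated) + variance(x_control)) / 2) ** 0.5
    return 100.0 * (mean(x_treated) - mean(x_control)) / pooled_sd

# Hypothetical years of schooling, before and after matching.
treated_before = [6, 8, 7, 9]
control = [10, 12, 11, 13]
treated_after = [10, 11, 12, 12]   # treated units retained as matches (made up)

bias_before = standardized_bias(treated_before, control)
bias_after = standardized_bias(treated_after, control)
print(round(bias_before, 1), round(bias_after, 1))
```

A successful match shows up as a much smaller absolute bias after matching than before, covariate by covariate.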
While Table 7 reports the estimated causal impact of hukou registration within types of work ownership and types of work, Table 8 reports the impact within types of employers and labor contract conditions. We find convincing evidence of hukou-based earning discrimination against agricultural hukou people. Regarding work ownership, respondents were asked in the survey, "do you work for yourself/family or are you employed by others/organizations/units/companies?". Their answers classified them as self-employed if they answered "work for myself/family" or as employed by others if they answered "employed by others/organizations/units/companies". The expectation is that earning discrimination should exist only among people who are employed by others and not among those who are self-employed. This is because those discriminated against in the labor market should have a greater incentive to enter self-employment, as discrimination lowers the expected wage in the labor market and thus lowers the opportunity cost of self-employment (Coate and Tennyson, 1992). [...] and there is a good match within the radius of 0.05. Moore (1983) predicted that self-employment was a method to avoid racist (or sexist) employment practices and should result in a higher black/white (or female/male) earnings ratio among self-employed workers than among their wage and salary counterparts. Therefore, taken together, the results from models 7.1 to 7.4 (higher earnings for agricultural relative to nonagricultural hukou people when self-employed and lower earnings for agricultural relative to nonagricultural hukou people when employed by others) confirm Moore's prediction about employer discrimination.
In the Chinese context, an alternative interpretation is that agricultural hukou people earn more because unique access and benefits are tied to rural hukou, including contracted farmland, housing land, and compensation for land requisition (Lu and Song, 2006). These are considered valuable assets to which urban hukou people do not have access. Thus, when rural hukou people engage in independent business or self-employment, they do better than their urban hukou counterparts because of their initial resource endowment. Overall, earning differences within work ownership provide strong evidence of employer discrimination, which self-employed people are able to avoid.
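The subsample comparison by work ownership can be sketched as a radius (caliper) matching estimate computed separately within each group. This is only an illustration under stated assumptions: the radius of 0.05 follows the value mentioned above, while the function names, the rows, and the earnings are hypothetical and are not the estimates in Table 7. Treatment is coded as agricultural hukou = 1.

```python
def radius_ate(scores, treated, outcomes, radius=0.05):
    """Average effect using all opposite-group units within `radius` of each unit's score."""
    diffs = []
    for s, t, y in zip(scores, treated, outcomes):
        matches = [y2 for s2, t2, y2 in zip(scores, treated, outcomes)
                   if t2 != t and abs(s2 - s) <= radius]
        if not matches:
            continue  # no comparable unit within the radius, so the unit is dropped
        counterfactual = sum(matches) / len(matches)
        diffs.append(y - counterfactual if t else counterfactual - y)
    return sum(diffs) / len(diffs) if diffs else None


def ate_by_group(rows, radius=0.05):
    """Apply radius matching separately within each subsample."""
    groups = {}
    for group, s, t, y in rows:
        groups.setdefault(group, []).append((s, t, y))
    return {g: radius_ate(*zip(*vals), radius=radius) for g, vals in groups.items()}


# Hypothetical rows: (work ownership, propensity score, agricultural hukou, earnings).
rows = [
    ("employed", 0.30, 1, 20000), ("employed", 0.32, 0, 24000),
    ("employed", 0.70, 1, 18000), ("employed", 0.72, 0, 21000),
    ("self-employed", 0.40, 1, 26000), ("self-employed", 0.41, 0, 25000),
    ("self-employed", 0.90, 1, 30000),  # has no match within the radius
]

result = ate_by_group(rows)
print(result)
```

Estimating within each subsample separately means that individuals are compared only with similar individuals of the other hukou type in the same work-ownership group, which is the comparison the discussion above relies on.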
In models 7.5 to 7.8, I also found suggestive evidence of hukou-based discrimination against agricultural hukou people across both agricultural and nonagricultural work. People were asked: "Is your job an agricultural job or a nonagricultural job?". From their answers, I can further investigate the impact of hukou registration by type of work. As explained in the questionnaire, an agricultural job includes work related to forestry, stock farming, fishing, and other sideline production. Intuitively, we can expect that agricultural hukou holders should face less or no earning discrimination in agricultural jobs as opposed to nonagricultural jobs.
The results in Table 7 show that, while agricultural hukou holders face an earning disadvantage in both agriculture- and nonagriculture-related professions, the disadvantage is larger in nonagriculture-related professions. In nonagriculture-related jobs, the earning difference is 2,924 yuan in wage and 2,714 yuan in income, and in agriculture-related jobs, the earning difference is approximately 2,218 yuan in wage and 2,295 yuan in income. Note that the result for income is not statistically significant, which indicates that nonwage earning dominates the impact here. This further indicates institutional discrimination, with agricultural hukou people engaging more in nonwage earning to avoid it. However, those with an agricultural hukou who stay in waged jobs cannot avoid discrimination. In other words, nonagricultural hukou people are a privileged group not only in nonagriculture-related professions but also in agriculture-related professions. Thus, the overall results in Table 7 tell us that earning discrimination is not a random incident; rather, it occurs systematically to agricultural hukou inheritors: they face discrimination when others employ them but not when they are self-employed, and it exists in both agricultural and nonagricultural professions.
[...] is 807 yuan in wage and 1,167 yuan in income when they work in private enterprises, but this earning difference is not statistically significant (models 8.3 and 8.4). These results are not surprising because, since the Chinese economic reforms started, the CCP has continuously pursued economic policies that are extremely biased toward urban people. These urban-biased policies are enforced even more strongly in SOEs, where people with rural hukou have minimal access to top positions (Yang, 1999, 2002).
Therefore, it seems that there is severe earning discrimination in government jobs and that privatization reforms have potentially reduced the earning gaps between the two hukou groups through competition. The finding is consistent with some of the existing Chinese literature; for example, Song (2016) concluded that, for observationally equivalent workers, agricultural hukou people earn 50% less than nonagricultural hukou people do in SOEs, but only 5% less in the private sector. It also supports the prediction of Becker (1957) and Arrow (1973) that employer discrimination is less intense under competition because nondiscriminators can gain a competitive advantage by substituting lower cost agricultural hukou people for relatively higher cost nonagricultural hukou people, ceteris paribus. It differs from the hypothesis of Long (1975) that the public sector is nondiscriminatory.
Another important condition under which the impact of hukou status should differ, and which provides evidence of institutionally designed discrimination, is the signing of a labor contract. Signing a labor contract represents particular types of workplace environments that are expected to be meaningfully different from environments where a labor contract does not apply. In the CFPS survey, people were asked: "Do you sign a labor contract for this job?". They responded "yes/no" or "not applicable". A labor contract is a contract between labor and owner/management governing wages, benefits, and working conditions that may or may not favor employees and employers. On the other hand, the "not applicable" option implies different workplace environments such as informal jobs, self-employment, an entrepreneurial venture, and a person running his/her own business. There is empirical evidence that a better understanding of a labor contract is associated with improved worker satisfaction. However, a labor contract comes with reduced wages and sometimes coexists with rights violations against workers (Z. Cheng et al., 2014). It is expected that earning discrimination should exist only where a labor contract is applicable because of biased institutions, legal systems, contracts, and others, which may favor high-class urban hukou people and discriminate against low-class rural hukou people. On the other hand, in a workplace where a labor contract is not applicable, people tend to be more independent and have fewer constraints, and thus there is less chance of earning discrimination by hukou status. Results in Table 8 are consistent with our expectation that, if a labor contract is applicable, agricultural hukou people earn significantly less than nonagricultural hukou people. The ATE of hukou status is approximately 2,604 yuan in wages and 2,196 yuan in incomes (models 8.5 and 8.6).
On the other hand, if a labor contract condition is not applicable, agricultural hukou people indeed earn more than nonagricultural hukou people in wage although they are equal in income (models 8.7 and 8.8). Note that this is not the impact of a labor contract but rather the impact of different employment and workplace environments where people are discriminated against based on their hukou identity. Thus, we observe further signs of institutionally enforced discrimination where people with nonagricultural hukou are offered better wages at the expense of agricultural hukou people's wages.

Conclusions
This study focuses on estimating earning discrimination against agricultural hukou people in China. Chinese citizens with agricultural hukou are found to be treated unfavorably in terms of both wage and nonwage earnings. Both parametric and nonparametric estimations reveal that the hukou-based earning gap is overwhelmingly generated not only between the two hukou types overall but also within sectors, job categories, employer categories, and labor contract conditions. The results in Tables 7 and 8 provide strong evidence that the earning difference between the two hukou groups is earning discrimination by hukou-based identity. As evidence of earning discrimination, the subsample results in Tables 7 and 8 are more compelling than the full-sample results in Table 5. Evidence of earning differences within work ownership, work types, employer types, and labor contract conditions gives a clear signal of systematic discrimination. Such disaggregation should eliminate the doubt that the earning disadvantage for people with agricultural hukou is due to rural-urban segregation rather than hukou identity. If the earning disadvantage for agricultural hukou people were due to rural-urban segregation, people would earn equally within the same occupation and would change their status to join the high-earning groups. However, there is a systematic barrier, and this barrier is institutionally cultivated in China. The full explanation certainly involves more than the hukou system, but hukou is an important part of it.
The long-run consequence of this earning discrimination against a particular group of people is more than just widening income and social inequality. It is an unfair system and a major barrier toward a modern welfare state. Earning, either wage or nonwage, is the backbone of maintaining the standard of living. When one group of people can earn high incomes while another group in the same society and in the same occupation cannot, the sufferings of the disadvantaged group multiply through market forces such as the cost of living. The only exception is if some form of social protection program protects these low-earning people, but evidence shows that agricultural hukou holders have lower access to social security programs, particularly when they are not in their hukou location (Nielsen et al., 2005; Xu et al., 2011). This finding has many policy implications. Considering that the hukou system has many advantages in Chinese society, if the Chinese government does not want to abolish this old system, it should focus on reducing hukou-based discrimination in the labor market, particularly in government organizations. City governments should stop granting urban hukou only to the most talented and wealthy people. Rather, it should be accessible on the basis of need; otherwise, human capital divisions between rural and urban areas will worsen.
Although this paper provides improved evidence of earning discrimination against agricultural hukou people, it is not without limitations. The data are self-reported by individual adults; thus, they do not capture underground economic activities (Becker, 1968). Rural and urban hukou people may engage in the underground economy differently, which may lead them to report lower wages/incomes. There is also a possibility that people hide their identity to avoid legal restrictions or discrimination (Deng and Cordilia, 1999; Dutton, 1997; Wang, 2004), and thus, they are less likely to appear in the social survey.
Notes: Standard errors are robust standard errors (in parentheses). Dependent variable is hukou status: agricultural hukou = 1, otherwise = 0. Coefficients should be interpreted as odds ratios. *p < 0.1, **p < 0.05, and ***p < 0.01. BMI, body mass index; CCP, Chinese Communist Party.